THE INTERPRETATION OF BURT’S REGRESSION
EQUATION


KARL J. HOLZINGER AND FRANK N. FREEMAN 
University of Chicago 


The object of the present note is to question the interpretation of
a regression equation worked out by Mr. Cyril Burt in his book
“Mental and Scholastic Tests.”¹ The equation has received wide
notice and has frequently been quoted to show that our best
intelligence tests measure intelligence to the extent of about 30 per cent,
while they measure school attainment to more than 50 per cent.

The variables in Mr. Burt’s equation were as follows: 

1. Mental age on a modification of the Binet-Simon Scale. 

2. School attainments expressed in terms of educational age. 

3. A reasoning test in age units, and regarded by Mr. Burt as a 
measure of intelligence. 

4. Chronological age. 

The regression equation for predicting Binet Score reads 

 Binet = .54 School Work + .33 Intelligence + .11 Age


This is interpreted by Burt to mean that “of the gross result, then,
one-ninth is attributable to age, one-third to intellectual development,
and over one-half to school attainment. School attainment is thus
the preponderant contributor to the Binet-Simon tests. To school
the weight assigned is nearly double that of intelligence alone,
and distinctly more than that of intelligence and age combined.
In determining the child’s performance in the Binet-Simon Scale,
intelligence can bestow but little more than half the share of the school, and
age but one-third the share of intelligence.

“Imagine two children, aged 7 and 17 respectively, both possessing
an intelligence equally normal, neither having passed a single hour in
school. The younger, as a consideration of the several tests will show,
might reach a mental age of 6; the older despite 10 years of seniority,
barely that of 9, so barren is growth deprived of opportunity.”¹

1 Burt, C.: “Mental and Scholastic Tests,” P. S. King and Son, London,
1921, p. 183.

Upon examining these statements it appears that Mr. Burt
regards the three regression coefficients as parts of a whole (in this
equation they add up to .98 by mere chance) and that he adds and
otherwise compares them directly with one another. Such
comparisons would seem to be justified on the grounds that the variables
are all in age units. This enables one to say that a year’s increase
in school attainment is accompanied on the average by .54 of a year
of increase in Binet Score, while a year’s increase in “intelligence” is
accompanied by only .33 of a year of mental age, the remaining
variables on the right being held constant each time.

This sort of argument is of course only valid in case Mr. Burt’s
reasoning test is a pure measure of intelligence. Professor Godfrey
Thomson² in his new book on educational psychology declares that
certain critics have been unjust in doubting Mr. Burt on this point
and that his reasoning test is a “measure of ‘native intelligence,’
entirely independent of schooling,” a conclusion which is reached by
scrutiny of the following table of correlation coefficients taken from
Burt’s study.


TABLE I. OBSERVED AND PARTIAL CORRELATIONS BETWEEN AGE, INTELLIGENCE,
SCHOOL ATTAINMENTS, AND THE RESULTS OF THE BINET-SIMON TESTS

Factors correlated            Observed      Factor          Partial        Factors eliminated            Partial
                              coefficients  eliminated      (first order)                                (second order)

Tests and school work         .91           Intelligence    .78            Intelligence and age          .61
                                            Age             .68
Tests and intelligence        .84           School work     .58            School work and age           .56
                                            Age             .65
Tests and age                 .83           School work     .19            School work and intelligence  .13
                                            Intelligence    .62
School work and intelligence  .75           Tests           -.06           Tests and age                 -.07
                                            Age             .40
School work and age           .87           Tests           .49            Tests and intelligence        .49
                                            Intelligence    .73
Intelligence and age          .70           Tests           .01            Tests and school work         .05
                                            School work     .15

1 Op. cit., p. 183.

2 Thomson, Godfrey: “Instinct, Intelligence, and Character.” Longmans
Green, 1925, p. 217.

Mr. Burt previously argued that “With both age and intelligence
constant, the partial correlation between school attainments and Binet
results remains at .61. Of all the partial coefficients of the second
order this is the largest. There can therefore be little doubt that with
the Binet-Simon Scale a child’s mental age is a measure not only of the
amount of intelligence with which he is congenitally endowed, not only
of the plane of intelligence at which in the course of life and growth he
has eventually arrived; it is also an index, largely if not mainly, of the
mass of scholastic information and skill which in virtue of attendance
more or less regular, by dint of instruction more or less effective, he has
progressively accumulated in school.”¹

The simple table of correlation coefficients seems hardly to justify 
this flight of rhetoric. 

Mr. Thomson now says that any one skilled in correlational
mathematics can see that Burt’s intelligence test is free from the above
defect and is quite independent of school work. This argument would
seem to follow from the partial correlation of -.07 between school
work and “intelligence” for Binet and age constant. Reference to
the table also shows that holding only Binet constant reduced the
correlation between school work and intelligence from .75 to -.06.
Holding age constant in addition reduced this result inappreciably
to -.07. Furthermore the correlation between intelligence and age is
changed from .70 to .01 by holding Binet constant. Whatever
correlation “intelligence” has with school work and age is thus due to Binet,
for when these tests are fixed the association becomes negligible.
Finally, no matter which variables are fixed the correlation between
“intelligence” and Binet remains substantial (.56), showing that
these two tests index the same mental function to a considerable
extent. This we venture is the sort of argument Professor Thomson
expects us to make in reaching the conclusion that Burt’s reasoning
test is a measure of native intelligence free from schooling.
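The partial coefficients cited in this argument can be recomputed from the six observed correlations by the usual recursion formula for partial correlation. The following is only a sketch: the function names `r` and `partial` are illustrative, the variables are numbered 1 = Binet, 2 = school work, 3 = reasoning test, 4 = age, and the inputs are the two-decimal values printed in the table, so the results can be expected to agree with the published figures only to about the second decimal.

```python
from math import sqrt

# Observed (zero-order) correlations from the table.
# 1 = Binet tests, 2 = school work, 3 = reasoning test, 4 = age.
R = {
    (1, 2): 0.91, (1, 3): 0.84, (1, 4): 0.83,
    (2, 3): 0.75, (2, 4): 0.87, (3, 4): 0.70,
}

def r(i, j):
    """Look up the observed correlation between variables i and j."""
    return R[(min(i, j), max(i, j))]

def partial(i, j, controls=()):
    """Correlation of i and j with the variables in `controls` held constant."""
    if not controls:
        return r(i, j)
    k, rest = controls[0], tuple(controls[1:])
    r_ij = partial(i, j, rest)
    r_ik = partial(i, k, rest)
    r_jk = partial(j, k, rest)
    return (r_ij - r_ik * r_jk) / sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))

# Tests and school work: intelligence constant, then intelligence and age.
print(round(partial(1, 2, (3,)), 2), round(partial(1, 2, (3, 4)), 2))
# School work and intelligence with the Binet tests held constant.
print(round(partial(2, 3, (1,)), 2))
```

Run on the tabled values, the first-order coefficient for tests and school work comes to about .78, the second-order coefficient to about .61, and the correlation of school work with the reasoning test falls from .75 to about -.06 when Binet is held constant.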

1 Op. cit., p. 182.

On first inspection this argument appears quite sound, but upon
more careful scrutiny of the variables involved it does not seem so
convincing. Using the notation given on the first page of this note
(1 = Binet, 2 = school attainment, 3 = reasoning test, 4 = age),
we may consider the results

$$r_{23} = +.75 \quad \text{and} \quad r_{23\cdot 1} = -.06$$

If Binet is a measure of both school attainment and intelligence, as
Burt and Thomson say, then the correlation between these two
variables for Binet constant must of necessity be small because the
quantities correlated are at the same time both being restricted. It
is true that the correlations,

$$r_{12} = .91 \quad \text{and} \quad r_{12\cdot 3} = .78,$$

show that Binet is a better predictor of scholastic achievement than
the reasoning test, but nothing in the above data or coefficients
warrants the conclusion that one test is a measure largely (if not
mainly) of scholastic information, while the other is a pure test of
native intelligence. The difficulty in part may be that both writers
have treated the variables as pure measures of certain traits in order
to show that one of them is pure and the other mixed. As a matter
of fact school attainment, Binet score, and reasoning ability are not
pure measures of any traits, and partial correlations between them
yield results without much practical meaning.
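The point that holding Binet constant must of necessity shrink the correlation between school attainment and intelligence can be illustrated with a deliberately simple model. The model is an assumption made for illustration only and is not drawn from Burt’s data: two standardized traits S and I correlate `rho`, and a composite B = S + I + e is formed with independent error e. The function name `partial_given_composite` and the error variances tried are hypothetical.

```python
from math import sqrt

def partial_given_composite(rho, noise_var):
    """Partial correlation of S and I with B held constant, assuming
    B = S + I + e, where S and I have unit variance and correlation rho
    and e is independent error with variance noise_var (illustrative model)."""
    var_b = 2 + 2 * rho + noise_var      # variance of the composite B
    r_sb = (1 + rho) / sqrt(var_b)       # correlation of S with B (same for I)
    return (rho - r_sb * r_sb) / (1 - r_sb * r_sb)

# Same observed correlation (.75), different amounts of error in the composite.
for noise_var in (0.5, 1.0, 2.0):
    print(noise_var, round(partial_given_composite(0.75, noise_var), 3))
```

Under these assumptions the partial coefficient is always below the observed .75, and with an error variance of 0.5 it comes to about -.07, although both traits enter the composite with equal weight. A small or negative partial correlation of this kind therefore says little about whether either measure is “pure.”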

Upon examining the term “schooling” as used by Burt and 
Thomson certain other ambiguities arise. Burt describes schooling 
in terms of educational age or achievement relative to pupils of the 
same chronological age. Thomson uses schooling in Burt’s sense in 
discussing its independence from the latter’s intelligence test, but in 
the next paragraph (p. 217) employs the term with an entirely different 
meaning. He is discussing the effect of “absence of schooling,”
or lack of educational experience, upon the IQ’s of gypsies and canal
boat men, and suggests that Burt’s test should have been used because
of its independence of schooling (in the first sense).

The relative achievement of children who have been in school
the same number of years, and the total years of schooling or
educational experience each has had, are quite different things and it is
confusing to the present issue to call them both by the same name.
Burt’s data apply only to schooling according to his definition of the
term, and conclusions regarding the relation of his test to schooling
as total educational experience are unwarranted.

A test independent of total educational experience is undoubtedly 
a very desirable thing, but a test independent of scholastic attainment 
(say educational age) would seem most undesirable, for intelligence 
to achieve is the only intelligence worth while. Binet predicts 
conspicuous scholastic achievement, for children of the same age, 
better than does Burt’s reasoning test, and on these grounds is a better 
intelligence test. Whether or not this result is due to common
elements in the educational and Binet test remains to be shown.

In the hope that some further light might be thrown on Burt’s
regression equation and its interpretation, the other three equations
were worked out. This was accomplished by an indirect process
because the standard deviations of zero order were unknown.

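Using the conventional subscript notation, Burt’s equation may
be written

$$
\begin{aligned}
x_1 &= b_{12\cdot 34}\,x_2 + b_{13\cdot 24}\,x_3 + b_{14\cdot 23}\,x_4 \qquad (1) \\
    &= \beta_{12}\frac{\sigma_1}{\sigma_2}\,x_2 + \beta_{13}\frac{\sigma_1}{\sigma_3}\,x_3 + \beta_{14}\frac{\sigma_1}{\sigma_4}\,x_4
\end{aligned}
$$

where

$$\beta_{12} = r_{12\cdot 34}\,\frac{\sqrt{1 - r_{13}^2}\,\sqrt{1 - r_{14\cdot 3}^2}}{\sqrt{1 - r_{23}^2}\,\sqrt{1 - r_{24\cdot 3}^2}}, \quad \text{etc.}$$

The next equation required is

$$x_2 = \beta_{21}\frac{\sigma_2}{\sigma_1}\,x_1 + \beta_{23}\frac{\sigma_2}{\sigma_3}\,x_3 + \beta_{24}\frac{\sigma_2}{\sigma_4}\,x_4$$

for which the ratios of the sigmas are required, the β’s being readily
obtained from the coefficients in Table I.
It is apparent that

$$\frac{\sigma_2}{\sigma_1} = \frac{\beta_{12}}{b_{12\cdot 34}}, \qquad \frac{\sigma_2}{\sigma_3} = \frac{b_{13\cdot 24}\,\beta_{12}}{b_{12\cdot 34}\,\beta_{13}}, \qquad \frac{\sigma_2}{\sigma_4} = \frac{b_{14\cdot 23}\,\beta_{12}}{b_{12\cdot 34}\,\beta_{14}}$$

Working out these values and the new β’s required gives the
desired equation,

$$x_2 = .69\,x_1 - .05\,x_3 + .47\,x_4 \qquad (2)$$

In a similar way, the remaining equations may be found,

$$x_3 = .95\,x_1 - .10\,x_2 + .07\,x_4 \qquad (3)$$

and

$$x_4 = .15\,x_1 + .51\,x_2 + .03\,x_3 \qquad (4)$$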


The above calculations were carefully checked throughout, and 
may be readily verified by the reader who is familiar with multiple 
regression coefficients. 
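One quick check open to the reader, offered here as a sketch rather than as the procedure the calculations above followed, rests on a standard property of partial regression: the coefficient of x_j in the equation for x_i, multiplied by the coefficient of x_i in the equation for x_j, equals the square of the partial correlation between the two variables with the remaining pair held constant. The dictionaries below simply transcribe the coefficients of equations (1) to (4) and the second-order partials of Table I; the names `b`, `r_partial` and `check` are illustrative.

```python
# Coefficient of x_j in the equation predicting x_i, from equations (1)-(4).
b = {
    (1, 2): 0.54, (1, 3): 0.33, (1, 4): 0.11,
    (2, 1): 0.69, (2, 3): -0.05, (2, 4): 0.47,
    (3, 1): 0.95, (3, 2): -0.10, (3, 4): 0.07,
    (4, 1): 0.15, (4, 2): 0.51, (4, 3): 0.03,
}

# Second-order partial correlations from Table I.
r_partial = {
    (1, 2): 0.61, (1, 3): 0.56, (1, 4): 0.13,
    (2, 3): -0.07, (2, 4): 0.49, (3, 4): 0.05,
}

def check(i, j):
    """Return (b_ij * b_ji, squared partial correlation) for the pair (i, j)."""
    return b[(i, j)] * b[(j, i)], r_partial[(i, j)] ** 2

for pair in r_partial:
    product, r_squared = check(*pair)
    print(pair, round(product, 4), round(r_squared, 4))
```

With two-decimal inputs the two columns agree to within about .001 for every pair, and each pair of reciprocal coefficients carries the same sign as the corresponding partial correlation.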

Turning first to equation (4) we shall attempt to interpret it in
the way in which Mr. Burt has interpreted equation (1). It appears
at once that over half of a child’s age is “attributable” to school
attainment. This is truly alarming. We had always supposed that
age was a comparatively simple thing, when it could be discovered,
but now we find that there can be little doubt that age is a measure
not only of the amount of age with which a child is congenitally
endowed, but it is also an index, largely, if not mainly, of the mass
of scholastic information and skill which in virtue of attendance
more or less regular, by dint of instruction more or less effective, he
has accumulated in school. Isolated from scholastic progress and from
development in mental age, intelligence subscribes but a paltry portion.
Indeed, if the child were removed from school and his mental age
taken away from him, he would probably not get old at all. The secret
of eternal youth has at last been discovered!

Some of the other regression coefficients are also puzzling. From
equation (1) it appears that one-third of Binet is “attributable”
to intelligence while equation (3) shows that nearly all of intelligence
is “attributable” to Binet. Now “attributable” cannot mean “share”
or “portion,” or “part” in the ordinary mathematical sense, for if
it does we arrive at an inconsistency. Thus we may write


 Binet = 1/3 Intelligence
 Intelligence = 9/10 Binet


from which, on substituting the second statement in the first, Binet = 3/10
Binet, so that we find that 3 = 10. Aside from this inconsistency, it is
a little hard to see why intelligence which is over 90 per cent Binet
shouldn’t produce about the same results as the latter test.

In a similar way equation (2) may be employed to show that
two-thirds of school work is due to Binet, while “intelligence” interferes
to a slight extent with school attainment. By previous results one-third
of Binet was due to intelligence, so that it looks as if two-ninths of
school attainment should be due to intelligence. Instead of this the
two interfere slightly. This is again disquieting.

It may finally be noted that the regression coefficients show merely 
the average change in the dependent variable for a unit change in 
the independent variable to which they are attached, the remaining 
variables being held constant. To interpret them as representing
the “parts” or “portions” of the independent variables which go to
make up the predicted variable is unwarranted. Such misconceptions
may arise in part from regarding the correlations as causal
relationships, whereas the association may be entirely due to the influence
of common variables not directly measured.


AN EXPERIMENTAL STUDY OF THE NATURE OF
IMPROVEMENT RESULTING FROM PRACTICE
IN A MENTAL FUNCTION¹


ARTHUR I. GATES AND GRACE A. TAYLOR 


Teachers College, Columbia University 


In another article² the writers discussed the need of determining
more exactly the nature of improvement which results from practice
and presented the results of a study in which a motor function, speed
of tapping, was utilized. The present study, conducted along lines
similar in most respects, utilizes a mental function and thus approaches
more closely the problem of the nature of changes in human
nature brought about by education in the more typically intellectual
functions.

Concerning the character and limits of improvement which 
continued training may produce, there have been, traditionally, two 
main theories to which must now be added a third. 


1. THEORY OF ACQUIRED SPECIFIC TECHNIQUES AND KNOWLEDGE


According to this theory, education or training produces
improvement which is due to the development of particular skills and
information, methods of work and knowledge used in working, subtle
adaptations to the working conditions, “the tricks of the trade,” all
these without any changes in the fundamental “capacities” engaged in
the function. Thus in studying arithmetic, improvement is said to be
due to knowledge of numbers and operations, speed and accuracy in
handling these particular data, improved methods of writing and
taking tests without any increase in ease of learning, in retentiveness
or in memory or reasoning, or in any of the neural and other machinery
involved in the tasks. The investigators who conceive “intelligence”
to be a capacity or group of capacities which cannot be appreciably
changed by education and experience usually subscribe to this view
in some form.


2. THEORY OF IMPROVABLE CAPACITIES OR MECHANISMS 


A second type of view affirms that practice may result not only in
the acquisition of specific information and skill but also in the
improvement of the neural and other machinery or in the “capacities”
involved. The older faculty theory, which assumed that memory,
perception, retentiveness, etc. were improved, in general, by practice
in some particular type of training which involved the general power,
has been modified, usually, in the face of the facts yielded by studies
of the transfer of training, but newer views, differing only in degree,
are still defended and defensible. To illustrate, it is possible to
maintain that the training of a function brings about an improvement in
the neurones or other mechanisms involved and that these,
fundamentally improved, may operate in other functions which bring them
into action. The observed transfer of training, in other words, may
be due not wholly to the rôle of identical information, methods of
work, techniques, etc. but partly, or mainly, to the influence of
identical narrow capacities or bits of machinery which operate in
various combinations in different gross functions. Views of this
sort appear to be held by certain writers who are opposed to the
notion that learning capacity, general intelligence and the like cannot
be effectively improved.

1 From the Research Department of the Horace Mann School.

2 An Experimental Study of the Nature of Improvement Resulting from
Practice in a Motor Function. Journal [...] Psychology (forthcoming).


3. THEORY OF STIMULATED GROWTH 


Recent work on the nature of growth and the factors which
influence it suggests another view. Since such capacities as retentiveness,
speed of motor response, intelligence, etc. are believed to grow
gradually from birth to maturity, given a normal environment, it is
conceivable that continued, intensive practice preceding maturity might
stimulate and increase the rate of growth of the capacities exercised.
Vigorous and persistent training might, by means of nutritive
after-effects, by increasing the production or changing the distribution of
“hormones” or in other ways now unknown, accelerate the process
of growth and carry it on to higher levels before maturation. If such
were the facts, the importance of early training and the biological
significance of the period of infancy would receive new emphasis.


THE PLAN OF THE EXPERIMENT 


The present experiment was designed to test, in some measure, the
nature of improvement in a mental function. For this purpose it
seemed advisable to use as subjects persons in whom growth was by
no means completed and in whom growth was going on, presumably,
with great rapidity, and to use a mental function which had been little
practiced and in which, presumably, the acquisition of technique would
be of rather small amount. After some preliminary study, we selected
as subjects a group of children, ages 4 to 5.8 years, from the
kindergarten of the Horace Mann School and, as a function, memory for
series of digits presented orally. Since this function constitutes one
of the recurring tests in the Stanford-Binet Scale, the results should
throw some light upon the nature of the functions which are accepted
as criteria of intelligence.

The experimental procedure called for two groups of subjects 
equivalent in abilities related to memory for digits, one of which was 
to be trained intensively and the other tested only at the beginning 
and end of the practice period. From a larger group of children, two 
groups were made up by matching each child in one group with a 
child in the other as nearly as possible the same in each of the 
following traits: 

1. Sex.

2. Age.

3. Mental age on the Stanford-Binet test.

4. Intelligence quotient.

5. Scholastic maturity as judged by teachers.

6. Memory for digits, presented orally.

7. Memory for letters, series presented orally.

8. Memory for series of unrelated words, presented orally.

9. Memory for series of related words, presented orally.

10. Memory for a series of 10 geometrical figures, each presented
visually for 5 seconds. After all were presented, a test of the recognition
type (originals mixed with new designs) was given. Tests were
given to pupils individually.

11. Memory for seven pictures (boy, hat, cap, etc.) presented
visually on a card strip for 15 seconds. Test by recognition method.
All tests given individually.

12. Memory for picture “names.” A series of 10 drawings of
common objects, each with its common name under it in type. Task is to
learn the “name” of each picture. Each card presented 5 seconds;
test by selecting words studied from a group including the old and new
ones. Individual tests.

Two groups of 16 pupils, approximately matched in these traits
and abilities, completed the experiment. The averages of these groups
in the measurable traits are shown in Table I. It may be seen in the
table that, in the averages, the two groups are substantially equivalent
at the beginning of the experiment, in the important traits taken into
consideration.


TABLE I. SHOWING AVERAGE SCORES OF PRACTICE AND CONTROL GROUPS IN
THE INITIAL TEST

Traits            Age¹   Mental  IQ    Digits  Letters  Unrelated  Related  Geometrical  Pictures  Picture
                         age¹                           words      words    figures                names

Practice group    5.1    6.31    122   4.33    3.64     3.86       14.0     4.3          5.3       7.5
Control group     5.1    6.35    123   4.33    3.71     4.07       13.7     4.0          5.7       7.0

1 Age and mental age as of Oct. 1, two months before the study was begun. 


THE RESULTS OF SPECIFIC PRACTICE IN MEMORY FOR DIGITS


Beginning shortly after the completion of the initial tests, the pupils
in one group were given, individually, on each available school day,
practice in immediate memory for digits. A large number of series
of digits, arranged by chance, were prepared and presented according
to the method prescribed in the Stanford-Binet scale. Each day,
the pupil began with a series shorter than the largest one on which he
had succeeded in two out of three trials on the preceding day. Each
child, then, on each day was tested on three series of each length from
a short series to a series on which he failed in two out of three attempts.
The results, presented in Table II and shown graphically in Fig. 1,
are based on the number of digits in the largest series in which the
subject was successful in two out of three attempts. In the group of
16 from which the averages were computed, all attended fairly
regularly. In the case of absences, the pupil was given the score earned
on the preceding test. The training was continued until May 20,
the end of the school year, making 78 days of practice.

Fig. 1.
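The scoring rule just described can be put in a few lines. This is a minimal sketch under stated assumptions: the function names `span_score` and `fill_absences` are hypothetical, and the sample record is invented for illustration and is not data from the study.

```python
def span_score(results):
    """Length of the largest series passed in at least two of three trials.

    `results` maps a series length to three booleans, one per trial.
    """
    passed = [length for length, trials in results.items() if sum(trials) >= 2]
    return max(passed) if passed else 0

def fill_absences(scores):
    """Give an absent pupil (None) the score earned on the preceding test."""
    filled, last = [], None
    for score in scores:
        last = score if score is not None else last
        filled.append(last)
    return filled

# Invented record for one child on one day (illustration only).
day = {
    4: (True, True, True),
    5: (True, False, True),
    6: (False, True, False),
}
print(span_score(day))                   # 5
print(fill_absences([4, None, 5, None])) # [4, 4, 5, 5]
```

The daily group average in Table II would then be the mean of the sixteen pupils’ scores for that day after absences have been filled in this way.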


TABLE II. SHOWING THE AVERAGE LENGTH OF DIGIT-SERIES RECALLED IN TWO
OUT OF THREE TRIALS FOR THE PRACTICE GROUP OF 16 SUBJECTS
78 DAYS OF PRACTICE

Day               1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20
Average digits   4.3  4.3  4.2  4.4  4.5  4.4  4.5  4.5  4.7  4.5  4.6  4.7  4.6  4.7  4.8  4.7  4.8  4.8  4.8  4.9

Day              21   22   23   24   25   26   27   28   29   30   31   32   33   34   35   36   37   38   39   40
Average digits   5.0  5.0  5.1  4.9  5.0  5.1  5.0  5.1  5.2  5.2  5.1  5.3  5.3  5.3  5.2  5.4  5.5  5.4  5.5  5.5

Day              41   42   43   44   45   46   47   48   49   50   51   52   53   54   55   56   57   58   59   60
Average digits   5.6  5.4  5.6  5.7  5.7  5.8  5.8  5.7  5.9  6.0  5.9  5.9  5.9  6.0  6.3  5.8  6.0  6.2  6.0  6.0

Day              61   62   63   64   65   66   67   68   69   70   71   72   73   74   75   76   77   78
Average digits   6.1  6.2  6.3  6.3  6.3  6.3  6.4  6.3  6.3  6.3  6.4  6.5  6.3  6.4  6.5  6.4  6.4  6.4

The trained group progressed steadily from an initial score of 4.33
to a final average score of 6.40 digits, a gain of 2.07 digits. In the
Stanford-Binet Scale, 4 digits is placed at year 4, and 6 at year 10.
The practice group, then, advanced during a period of 4.5 months
during which they practiced on 78 days, an amount equal to that
which the average untrained child advances in approximately 6 years.

The control group was given the test on the first and last of the 78
days. The average score on the first test was 4.33 and on the last
5.06, a gain of 0.73 digits. The gain of the practice group is appreciably
greater than that of the untrained group.


THE INFLUENCE OF DISUSE ON IMPROVEMENTS 


The problem, now, is to determine whether the improvement in the 
case of the trained group is due entirely to acquired techniques in 
handling digits and adjusting to the test conditions, or partly or wholly 
due either to the improvement of capacities underlying immediate 
memory for digits by some direct means or to capacities improved 
indirectly by a stimulation of growth, or to both. We therefore 
decided to apply, as one means of discovering the character of the 
improvement, a test of the influence of disuse. 

On Oct. 10, 1924, approximately 4.5 months after the final practice 
day in May—a period approximately equal in length to the practice 
period—14 pairs of the original group of 16 pairs were found and 
tested. The results for the two groups were as follows: 








TABLE III

                                                    Practice   Control
                                                    group      group

Average score, initial tests, Dec. 20, 1923         4.36       4.41
Average score, final test, May 20, 1924             6.36       5.08
Gain, during period of practice                     2.00       0.67
Average score, test after disuse, Oct. 10, 1924     4.71       4.77
Net gain over initial tests                         0.35       0.36











This result is most significant. While the practice group at the
end of 4.5 months of training excelled the control group by an
appreciable amount (an amount equal to about 4 years of average growth
according to the Stanford Scale), after 4.5 months of disuse the
advantage had been lost completely; the two groups were as nearly
equal as they were at the beginning of the study.
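The gains and the loss of advantage can be recomputed directly from the averages in Table III. The sketch below only repeats that arithmetic; the dictionary layout and the helper name `gain` are illustrative choices.

```python
# Average digit-span scores from Table III (14 matched pairs).
scores = {
    "practice": {"initial": 4.36, "final": 6.36, "after_disuse": 4.71},
    "control":  {"initial": 4.41, "final": 5.08, "after_disuse": 4.77},
}

def gain(group, later, earlier="initial"):
    """Gain of a group between two tests, rounded to two decimals."""
    return round(scores[group][later] - scores[group][earlier], 2)

for group in scores:
    print(group, gain(group, "final"), gain(group, "after_disuse"))

# Advantage of the practice group over the control group at each test.
advantage = {
    test: round(scores["practice"][test] - scores["control"][test], 2)
    for test in ("initial", "final", "after_disuse")
}
print(advantage)
```

The practice group leads by 1.28 digits at the end of training, while before practice and after disuse the two groups differ by only about .05 digit, in each case in favor of the control group.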

As we see it, the experiment indicates that in the case of memory
for oral digits among these children, at least, improvement brought
about by 4.5 months of practice, while appreciable, is due not to
capacity increased directly or indirectly by means of accelerated
growth but exclusively to the acquisition of technique. What
constitutes technique, the experiment does not disclose. The factors
may be better habits of attention under test conditions, adaptation to
the examiner’s signals and voice, the elimination of strain or anxiety,
or more subtle devices utilized in keeping the digits in mind, in seeing
relations among them, in visualizing them or whatnot. Whatever they
may be, the study indicates that they are rather unstable at least in the
sense that all the skills and mnemonic devices acquired during the
period of training were apparently lost during an equal period of disuse.

Another fact also suggests that the techniques were unstable. The
tests in October, 1924 were given by a different, but skillful,
experimenter from the one who gave the earlier tests. As shown in Table
III, the average scores of both groups in the last test were smaller
than those obtained 4.5 months before although somewhat better than
those of the initial tests 10 months earlier. The suggestion is that
improvement brought about by practice consists in part of adaptations
to the voice, mannerisms, and features of the test-technique of the
examiner.

RESULTS FROM THE SERIES OF TRANSFER TESTS 


As a further check upon the nature of improvement, the series of
memory tests, given before practice in digits was begun, were repeated
in October, 1924.¹ The results of these tests are shown in Table IV.


TABLE IV. SHOWING THE INITIAL (I) AND FINAL (F) SCORES AND GAINS (G)
IN SEVERAL TESTS OF MEMORY FOR THE PRACTICE AND CONTROL GROUPS

                         Letters  Unrelated  Related  Geometrical  Pictures  Picture
                                  words      words    figures                names

Practice group    I       3.64     3.86      14.0      4.3          5.3       7.5
                  F       4.11     3.86      16.2      4.9          6.8       9.0
                  G       0.47     0          2.2      0.6          1.5       1.5
Control group     I       3.71     4.07      13.7      4.0          5.7       7.0
                  F       4.28     3.93      17.2      5.2          6.7       8.8
                  G       0.57    -0.14       3.5      1.2          1.0       1.8

Difference in favor of
practice group           -0.10     0.14      -1.2     -0.6          0.5      -0.3
PE difference             0.30     0.25       1.8      0.4          0.6       0.7


The differences in gains between the two groups are inappreciable;
each group shows in about half of the tests a slight but really
unreliable superiority such as would occur if the groups were substantially
equal. There is, in other words, no evidence that the prolonged
training with digits has brought about any permanent improvement
in immediate memory for other materials.

1 It was unfortunate that circumstances beyond our control made it impossible
to give these tests in May, 1924.
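The unreliability of these differences can be seen by setting each one against its probable error. The sketch below uses only the two bottom rows of Table IV; the names `tests`, `ratios` and `favoring_practice` are illustrative, and no particular criterion of significance is assumed.

```python
# Difference in gain (practice minus control) and its probable error,
# as printed in the last two rows of Table IV.
tests = {
    "letters":             (-0.10, 0.30),
    "unrelated words":     (0.14, 0.25),
    "related words":       (-1.2, 1.8),
    "geometrical figures": (-0.6, 0.4),
    "pictures":            (0.5, 0.6),
    "picture names":       (-0.3, 0.7),
}

# Size of each difference in units of its own probable error.
ratios = {name: abs(diff) / pe for name, (diff, pe) in tests.items()}
for name, ratio in ratios.items():
    print(f"{name:20s} {ratio:.2f}")

# Number of tests in which the difference favors the practice group.
favoring_practice = sum(1 for diff, _ in tests.values() if diff > 0)
print(favoring_practice)
```

No difference is larger than about one and a half times its probable error, and the sign of the difference favors the practice group in two tests and the control group in four.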


RESULTS FROM A SECOND PERIOD OF PRACTICE, JAN.—APR., 1925 


After reviewing the results of the “retention tests,” i.e., those
given after the interval of disuse, the fact that they were not extensive
led us to decide to give a longer series, a series which really amounted
to another practice or “relearning” period.

Eleven of the originally trained pupils, all that were available, were
matched as nearly as possible in sex, age, mental age, intelligence
quotient and memory for digits, with 12 others,¹ 8 of whom were
members of the original control squad. A comparison of the two
groups is given in Table V.


TABLE V. AVERAGES FOR THE TWO GROUPS IN JANUARY, 1925

                            Age     MA     IQ     Digits

Original practice group     6.34    8.1    128    4.73
Unpracticed group           6.46    8.2    126    4.83

















Beginning Jan. 27, 1925 (13.5 months after the beginning of the
study, 8 months after the end of the practice period and 3.5 months
after the “retention” tests), both squads of children were given 22
days of practice in memory for digits somewhat more intensive than
before. Each child was given 3 series of digits from lengths well
within his grasp to a length on which he missed 2 or 3 times in 3
trials. This procedure was repeated three times on each day, i.e., 9
series of each length. The scores used for securing the averages were
the number of digits in the longest series on which the subject succeeded
in two-thirds or more of the trials. These exercises were conducted
by two experienced examiners, each taking half of each group, who
had given none of the earlier tests to these pupils.

The results, in terms of the average daily scores for the 22 practice 
days are shown in Table VI. 

Table VI shows essentially equal improvement for the two groups.
There is no evidence that the 78 days of training which ended 8 months
earlier had brought about any improvement in the fundamental
capacities underlying memory for digits either in some direct way or,
indirectly, by the stimulation of growth of these capacities.





1 Twelve were used since in one case it was necessary to take the average of
two children to yield a match with one of the trained children.

TABLE VI. AVERAGE DAILY SCORES IN MEMORY FOR DIGITS

Day                   1     2     3     4     5     6     7     8     9    10    11    12

Practice group       4.73  4.82  5.00  4.82  4.73  4.82  4.91  5.00  4.82  5.27  5.27  5.18
Unpracticed group    4.83  4.83  4.93  4.93  4.93  5.00  4.93  5.17  5.00  5.25  5.33  5.33

Day                  13    14    15    16    17    18    19    20    21    22    Gain

Practice group       5.36  5.27  5.45  5.18  5.36  5.54  5.54  5.73  5.63  5.73  1.00
Unpracticed group    [daily values for days 13 to 22 illegible]                  1.09








SUMMARY AND CONCLUSIONS 


The main facts produced by the experiment are as follows: 

1. Practice in immediate memory for digits on each of 78 days 
during a period of 4.5 months by young children results in a marked 
gain in ability. 

2. A group of children of equal ability in memory for digits and in 
sex, age, mental age, IQ and in other forms of memory who were 
given no practice were, when tested at the end of the practice period, 
better than at the beginning but clearly inferior to the practice group. 

3. After 4.5 months of no practice, the two groups were again 
equal—the practice group had entirely lost its advantage. They 
were also equal in tests of immediate memory for other materials. 

4. After 3.5 more months without practice, both groups were
given 22 days of intensive training at the end of which the 2 groups
were still approximately equal. Neither in results (3) nor (4) was
evidence of any permanent effects of the 78 days of practice found.

The facts we interpret as follows: The improvement brought about
by specific practice is due to the acquisition of special and subtle
techniques of work, to adjustments to the test conditions, familiarity
with digits—to acquired information and specific methods of attack.
These special techniques and mnemonic aids seem to be unstable
and transitory inasmuch as after 4.5 months of disuse they had
disappeared.

Since the effects of the intensive training are so evanescent and 
since no permanent results of any kind favoring the practice group 
were found, the conclusion suggested is that training, under the
conditions of the study, produced no increase in the capacities which
underlie the function either in some direct way, or indirectly, by means of
the stimulation of growth. The improvement brought about by practice
seemed to be wholly in the form of devices, information, adjustments
to the test conditions, “tricks of the task.” In suggesting these
conclusions, the present study is in harmony with another, performed
earlier, in which a motor function—speed of tapping—was utilized.1

The demonstrated unstable and transitory character of improvement
in memory for digits indicates the desirability of obtaining, in
other studies of similar skills, a measure of the permanence of the 
improvement brought about by practice. Are the well known 
increases in efficiency in memorizing poetry and prose, in solving verbal 
and mechanical problems, in rate of reading of several types which 
are usually rapidly produced by specific practice as evanescent as 
the improvement secured in this study? Or, is such instability pecu- 
liar to but few functions or is it a unique characteristic of the learning 
of very young children? The present literature, so far as we know, 
gives no satisfactory answer to these questions. 


The findings have, we think, a bearing on the general theories
concerning the natures and relations of native and acquired ability, of
capacity and proficiency, of which the problem of the nature of
intelligence is a part. By capacity may be meant the functional possibilities
of the neural and other mechanisms which make possible an ability
that appears without special or intensive training; a capacity which,
as disclosed in the studies, develops with time, with or without
intensive practice. Upon these factors training has no appreciable effect.

Proficiency normally depends in part upon capacity and partly 
upon techniques and information, adjustments to the task, etc. which 
may be acquired during practice and experience. No doubt the rela- 
tive degrees of these two components vary greatly among different 
functions; in memory for digits the importance of native capacity 
appears to be relatively large, partly because of the instability of the 
acquired factors. While the relations may be different in other 
functions, the distinction between capacity and proficiency may be 
usefully recognized. Similar investigations with other functions are 
needed to clarify the significance of native capacities and acquired 
abilities. The completion of similar investigations with a variety of 
mental functions would contribute appreciably to our knowledge of
the special problem of the nature of intelligence. 





1 Gates, A. I., and Taylor, Grace A.: Journal of Experimental Psychology
(forthcoming). 
AN AUTOMATIC MACHINE FOR MAKING
MULTIPLE APTITUDE FORECASTS1


CLARK L. HULL 
Dept. of Psychology, University of Wisconsin 


Madison, Wisconsin 


Early in 1923 the writer prepared for publication an article in which 
it was suggested as a somewhat Utopian speculation that some day a 
computing apparatus might be constructed which would solve multiple 
regression equations automatically, and thus yield quite mechanically 
and cheaply series of aptitude forecasts for purposes of vocational 
guidance.2 Soon after, while working on the final details of the design
for an automatic correlation calculator, the basic principles of such an 
aptitude forecasting device were hit upon. The similarities of the prin- 
ciples involved in the two machines were found to be such that the 
correlation machine as finally constructed served both purposes equally 
well. Thus what appeared at first sight to be a highly speculative 
conjecture was realized with quite unexpected promptitude. 

As suggested above, the aptitude forecaster performs its function by 
the automatic solution of multiple regression equations or of any 





1 The writer is indebted to the University of Wisconsin for part of the fund with 
which the costly first model of the machine was built and to the Committee on 
Scientific Problems of Human Migration, National Research Council, for a
considerably larger part. He is indebted to Mr. O. E. Romare, Chief Mechanician, 
University of Wisconsin, and to Mr. Harold C. Kidder, Mechanic, for splendid 
cooperation and many valuable suggestions throughout the construction of the 
machine. 

The design of this machine was begun in February, 1921. The actual
construction of the machine was commenced in April, 1923. The model was exhibited
at the Madison meeting of the American Psychological Association in December,
1923. Work was continued on it during 1924. The machine was sufficiently
perfected by the summer of that year to solve multiple regression equations
automatically as well as do practical correlation work on a large scale. In December,
1924, the machine was demonstrated before the Washington meeting of the Ameri- 
can Psychological Association both as an automatic solver of multiple regression 
equations and as an automatic correlation calculating machine. 

2 The Joint Yield of Teams of Tests. Journal Educational Psychology, October, 
1923, p. 405. 

An Automatic Correlation Calculating Machine. Journal of the American
Statistical Association, December, 1925. This article describes some of the more
technical aspects of the machine in connection with its use in computing means,
standard deviations, and correlation coefficients. The article includes two plates.
similar forecasting formulae.1 The regression equation must of course
be worked out previously, and the test scores be secured, before the 
machine can make the forecast. Unfortunately the derivation of a 
multiple regression equation for a good-sized battery of tests is a 
research project of some magnitude and one which requires a certain 
amount of technical knowledge and skill. Once derived, however, 
the principle of its use is simple. 

The action of the machine can best be explained after the consider- 
ation of a concrete aptitude forecasting situation. A high school 
freshman in a suburb of Milwaukee made the following scores on a 
battery of five tests: 


TEST                              SCORE   SYMBOL IN EQUATION
(name illegible in source)            8   $X_1$
(name illegible in source)            4   $X_2$
(name illegible in source)           10   $X_3$
(name illegible in source)           55   $X_4$
(name illegible in source)           45   $X_5$


The particular battery had been assembled for the purpose of predict- 
ing aptitude in high school algebra and the following formula had 
been derived, into which the scores of any subject might be substituted 
and the forecast be secured.2

$$\text{Predicted mark in algebra} = .69X_1 + .25X_2 + .36X_3 + .41X_4 + .45X_5 + 16.5$$

Substituting the above set of test scores,

$$\text{Predicted mark in algebra} = .69 \times 8 + .25 \times 4 + .36 \times 10 + .41 \times 55 + .45 \times 45 + 16.5$$

Solving,

$$\text{Predicted mark in algebra} = 69.4$$
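
The substitution can also be checked mechanically. The following is a minimal Python sketch and not part of the original article; the names `WEIGHTS`, `CONSTANT` and `predict_mark` are illustrative assumptions, while the weights and scores are those quoted in the text.

```python
# Weights and constant term taken from the algebra regression equation
# quoted in the text; the names themselves are illustrative assumptions.
WEIGHTS = [0.69, 0.25, 0.36, 0.41, 0.45]
CONSTANT = 16.5


def predict_mark(scores, weights=WEIGHTS, constant=CONSTANT):
    """Return the weighted sum of the test scores plus the constant term."""
    if len(scores) != len(weights):
        raise ValueError("one score is required for each weight")
    return sum(w * x for w, x in zip(weights, scores)) + constant


# Scores X1..X5 of the high school freshman in the example.
freshman_scores = [8, 4, 10, 55, 45]
print(round(predict_mark(freshman_scores), 1))  # 69.4
```

Any other subject's scores can be passed to `predict_mark` in the same order to obtain the corresponding forecast.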


The regression equation thus indicates that this particular student is
likely to be pretty weak in algebra, hardly doing a passing grade of
work. As a matter of fact the student “passed” at the end of the year
with a grade of 71. This is a closer prediction than forecasts from
the above tests and multiple regression equation will average. In the
same way the test scores of any other such student who has taken the
test under standard conditions may be substituted in the same equation
and there will result an estimate or forecast of his particular
aptitude in algebra.

1 Prediction Formulae for Teams of Aptitude Tests. Journal of Applied
Psychology, 1923, Vol. VII, pp. 277-284.

2 This equation was worked out by Miss Margaret V. Kelin under the writer’s
direction.

When the operation of forecasting is performed by the machine, 
the regression equation is placed in the machine at one place and the 
test scores are placed in it at another. The machine then proceeds 
automatically to multiply each test score by its own proper weight, 
combining the products as it goes along. When the computation is 
completed the machine stops automatically and the forecast may be 
read off at the convenience of the operator. The test scores and the 
regression equation are both given to the machine in the form of
perforated records. These are four inches wide and somewhat resemble
a music roll in appearance. In the case of the test scores the record
is made of tough Kraft paper. The perforations are made in the paper 
with precision by a special recording device which somewhat resembles 
a simplified typewriter. With this recorder a set of test scores such 
as given above can be transferred to the paper tape in a few seconds. 
In the case of the regression equation the perforated record may also 
be of paper. Since, however, the same regression equation is likely
to be used over and over again in combination with the test scores
from any number of individuals, the equation in such cases should 
probably be recorded on a more permanent material such as a thin 
metal band. 

In recording regression equations the signs of the weights or 
coefficients are disregarded since the present form of the machine adds 
but does not subtract. Equations involving negative signs are accord- 
ingly transformed for machine use in such a way as to permit of solution 
without subtraction. 
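
The article does not say which transformation is used. One possibility, offered here purely as an assumption, is to replace each negatively weighted score X by its complement (X_max - X) and to fold the fixed amount into the constant term. The Python sketch below illustrates that idea; the function names and the example figures are hypothetical.

```python
def to_additive_form(weights, constant, max_scores):
    """Rewrite a regression equation so that every weight is non-negative.

    Assumed technique (not taken from the article): a term -b*X with b > 0
    is replaced by b*(X_max - X), and b*X_max is taken off the constant so
    that the forecast is unchanged. Returns (new_weights, new_constant,
    reflect), where reflect[i] is True when score i must be entered as
    X_max - X.
    """
    new_weights, reflect = [], []
    new_constant = constant
    for w, x_max in zip(weights, max_scores):
        if w < 0:
            new_weights.append(-w)
            reflect.append(True)
            new_constant -= (-w) * x_max
        else:
            new_weights.append(w)
            reflect.append(False)
    return new_weights, new_constant, reflect


def forecast(scores, weights, constant, max_scores, reflect):
    """Accumulate products only; the reflected scores are prepared beforehand."""
    total = constant
    for x, w, x_max, flip in zip(scores, weights, max_scores, reflect):
        entered = x_max - x if flip else x
        total += w * entered
    return total


# Hypothetical two-test equation: forecast = .5*X1 - .3*X2 + 20.
weights, constant, max_scores = [0.5, -0.3], 20.0, [10, 10]
new_w, new_c, reflect = to_additive_form(weights, constant, max_scores)
print(forecast([6, 4], new_w, new_c, max_scores, reflect))
```

The adjustment to the constant is made once, when the equation is prepared, so that only additions remain at the time of forecasting.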

Partly for purposes of simplicity in exposition, the above account
has pictured the machine as making single aptitude predictions by
solving isolated regression equations. As a matter of fact, the machine
has really been designed as an integral part of a comprehensive 
system of vocational prognosis in which the machine will solve in 
immediate succession a large number of different equations each yield- 
ing a forecast for a different vocational aptitude, all equations being 
based upon one and the same battery of tests. The detailed nature of 
the system together with the methods by which it is proposed to realize 
it will be described in a future article. Since the proposed system 
differs radically from the prevailing methods, a brief sketch of it will be 
necessary at this time in order that the potentialities of the machine
may appear.

The proposed system is designed primarily to furnish youths
seeking light in the choice of a life work, as much assistance as applied 
psychology can give. The youth wishes to know which of all the voca- 
tions in the world is most likely to yield him greatest success. This 
can only be determined by knowing in some sense or other his relative 
aptitude on each. To forecast his aptitude on all possible vocations 
is obviously out of the question. Fortunately certain groups of voca- 
tions are very similar. The focal or central occupation of such a 
group of vocations may be called a type aptitude. It is assumed for 
purposes of aptitude prognosis that all important vocations can be 
grouped into about 40 type aptitudes or occupations. It is proposed, 
then, to furnish him in need of vocational guidance, a prediction of 
his potential success on each of the 40 type occupations. Forecasts 
will all be made in terms of a single uniform scale so as to permit ready 
and direct comparisons of the various potentialities. Within the prog- 
nostic limits of the system, the youth can then tell in what lines of 
activity he will probably be weak and can therefore avoid them. He 
can also see in what lines he has special strength. After making a 
study of these latter vocations in the light of his interests, tastes and 
opportunities, the youth will be able to choose his life work with a 
degree of intelligence now unknown. And lastly, but by no means of 
least importance, the cost of the service to the individual or society 
must be moderate—probably measured in cents rather than dollars. 

It is evident that by methods such as are at present developing, 
such a service as is proposed would be an economic impossibility. 
In the first place there must be considered the fact, so generally neg- 
lected, that the forecasting efficiency of test batteries will probably 
never exceed 40 per cent.1 On the other hand must be considered the
cost. Allowing 2 hours for giving each battery, it would require something
like 10 days of each subject’s time to take the 40 batteries of tests,
and the same amount of time by a trained psychologist to give them.
To this must be added several days more of labor for scoring such a 
mass of test results for each subject and for working up the scores into
the various aptitude forecasts. It is thus apparent that aptitude testing
as now developing is inconceivable as a practical working
possibility.

1 The forecasting efficiency of a test or battery is given by the formula:
$$E = 1 - \sqrt{1 - r^2}$$
A battery yielding a correlation of .80 with a criterion is shown by this formula
to be only 40 per cent efficient. At present the best of the aptitude batteries
range around .65 or .70. A correlation of .65 corresponds to efficiency of 24
per cent. This subject is discussed more fully in this Journal, February, 1925,
pp. 78-81.
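
The figures quoted in the footnote on forecasting efficiency can be reproduced with a few lines of Python. This is a minimal sketch and not part of the original article; the function name `forecasting_efficiency` is an illustrative assumption, and the formula is the one given in the footnote.

```python
import math


def forecasting_efficiency(r):
    """E = 1 - sqrt(1 - r^2), the formula quoted in the footnote."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return 1.0 - math.sqrt(1.0 - r * r)


# Correlations mentioned in the footnote.
for r in (0.65, 0.70, 0.80):
    print(f"r = {r:.2f}  efficiency = {forecasting_efficiency(r):.0%}")
```

The efficiency rises slowly at first and steeply only as r approaches 1, which is why a correlation of .80 still leaves the forecast only 40 per cent efficient.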

Fortunately there is a way out of the difficulty. In a series of 40 
aptitude test batteries there is certain to be an enormous amount of 
duplication of tests. Few capacities required for success in any given 
vocation would not be involved to some, though hardly the same, 
degree in at least a few of the other 39 vocations. Doubtless also 
some capacities contribute, though in varying degrees, to success in 
nearly all vocations. All this undoubted duplication of tests should 
accordingly be eliminated. It is doubtful whether there would then 
remain more than 30 or 40 distinct test-capacities as contributing in 
important amounts to success in the main type occupations. This 
limited number of essential tests then could easily be organized into a 
single battery which would perhaps require only 4 or 5 hours to give. 
At any rate there would be an enormous reduction in the labor 
involved. Thus at once would be eliminated the greatest difficulty. 

The test scores from the single universal battery of tests thus 
obtained, must next be combined in various groupings according to 
the various combinations of capacities required for success in each of 
the 40 type aptitudes and according to the quantitative importance of 
each trait in each particular combination. In other words, 40 very
long multiple regression equations, probably involving a score or more
of items each, must be solved to obtain the aptitude forecasts from the
test scores. This will involve the multiplication of some 800 pairs of
one-, two-, and three-place numbers, together with the summation of an
equal number of products. Such an amount of hand work for each
subject even with the best modern aids, is formidable to say the least. 
It was to perform these computations automatically that the machine 
described above was designed. 
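
The size of the task can be made concrete with a short Python sketch, which is not part of the original article. It assumes, for simplicity, exactly 20 weighted items in each of the 40 equations; the weights and scores are made up for illustration, and the name `forecast_all` is hypothetical.

```python
import random

N_EQUATIONS = 40   # one equation per type aptitude
N_ITEMS = 20       # "a score or more" of weighted items, taken here as exactly 20


def forecast_all(equations, scores):
    """Solve every equation for one subject.

    `equations` is a list of (weights, constant) pairs and `scores` the
    subject's results on the tests entering the equations. Returns the
    list of forecasts and the number of multiplications carried out.
    """
    forecasts, multiplications = [], 0
    for weights, constant in equations:
        total = constant
        for w, x in zip(weights, scores):
            total += w * x
            multiplications += 1
        forecasts.append(total)
    return forecasts, multiplications


# Made-up weights and scores, for illustration only.
rng = random.Random(0)
equations = [([rng.uniform(0, 1) for _ in range(N_ITEMS)], rng.uniform(0, 20))
             for _ in range(N_EQUATIONS)]
scores = [rng.randint(0, 100) for _ in range(N_ITEMS)]

forecasts, multiplications = forecast_all(equations, scores)
print(len(forecasts), multiplications)  # 40 800
```

The count of 800 multiplications per subject is simply 40 equations times 20 items, which matches the estimate in the text.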

In such wholesale aptitude forecasting as proposed, the procedure 
will be as follows: The 40 multiple regression equations will first be 
placed, one following the other, on a single permanent band of thin 
metal. In this form the equations will be placed permanently in the 
machine and need never be given any further attention. The test 
scores from the universal battery of tests will be recorded on a paper 
tape as described above. This recording will require about two 
minutes. The tape, about 18 inches long, will be cemented together 
at the ends to form a small circular band. This band will then be 
placed in the machine and the starter pressed. The machine will 
then proceed automatically to compute in succession each of the 40
aptitude forecasts required, the test band making one revolution for 
each forecast made. 

As at present constructed, the machine stops automatically when 
each forecast is completed, permitting an operator to record the 
result before proceeding to the next forecast. So extensive will be 
the computations involved in such wholesale forecasting as proposed, 
even an automatic machine will probably require an hour or so to 
perform the calculations for one subject. Partly to eliminate the 
expense of an operator to continually copy off these forecasts, but even 
more to avoid errors in reading and copying, there has been designed 
an attachment for the machine which will record the forecasts auto- 
matically as they are made. By means of this device the machine, 
once the starter is pressed, will proceed without any attention whatever 
from the operator to make the entire 40 forecasts without interruption, 
printing each down as completed on a specially designed blank card 
bearing the subject’s name. At the conclusion of the task the machine 
will stop automatically, at the same time ringing a bell to call the 
attendant. The card, when removed from the machine, will present 
in orderly array, each opposite the name of the vocation in question 
and in units of a single uniform scale permitting of instant comparison, 
scientific forecasts for an individual in all the chief type-occupations 
of the world. 
A DISCUSSION OF THE QUOTIENT METHOD OF
SPECIFYING TEST RESULTS 


GERTRUDE RAND 
Bryn Mawr College 


The validity of the quotient or ratio method of specifying test 
results has been a subject of discussion since the method was made an 
integral part of the Stanford Revision of the Binet-Simon test. Ter- 
man had shown that for the groups selected for the standardization of 
the revision, 80 to 120 cases at each age, the distribution of intelli- 
gence quotients for the different ages was fairly symmetrical and that 
“the range, including the middle 50 per cent of IQ’s was found practically
constant from 5 to 15 years. The tendency is for the middle 50
per cent to fall (approximately) between 93 and 108.”1 Elsewhere2
he gives the range of quotients including the middle 50 per cent of
cases for 2 years combined as follows: ages 5 and 6, 15 IQ points;
ages 7 and 8, 16+ IQ points; ages 9 and 10, 15 IQ points; ages 11 and
12, 17 IQ points; and ages 13 and 14, 16 IQ points. The interquartile
range of months of mental age at the various chronological ages3
when converted into range of IQ gives the following slightly different
figures: for age 6, 13.6 IQ points; age 8, 13.8 IQ points; age 10, 13.9 IQ
points; age 12, 14.2 IQ points; and age 14, 16 IQ points.* The similarity
of the distributions and of the middle 50 per cent range of intelligence
quotients at the various ages, together with the close agreement
that was found to exist between the distribution of quotients for
the combined ages and theoretically “normal” distribution, have 
tended to give to the Stanford-Binet IQ a unit significance which has 
served as a basis for classification at all ages. On the strength of the 
above rather scant experimental verification, the unit significance of 
this IQ scale of values has been rather generally accepted to hold for 
ages 5 to 14, and subsequent experimental interest has centered on 
the question of the validity of the method for prediction of future 
intellectual status as shown by retest after an interval of time. That 
two points are involved in the constancy of the IQ does not seem 
always to have been clearly recognized; that is, the quotients not only 
for the median and perhaps quartile values but over their entire range 





* I have obtained these values by dividing the interquartile range of months
of mental age by the median mental age. For the median mental ages, see the 
frequency distributions of IQ’s, pp. 34-37 of the article referred to above. 
from high to low must be shown to have the same significance at the
different ages; and a given individual must have maintained his
intellectual status during the interim between the tests. Evidence
obtained from retests alone does not distinguish in case of a change 
in the IQ which of the above has been the cause of the variation. 

During the past few years a rapid extension has been made of the 
quotient method to other intelligence tests, both individual and group; 
to tests of broader abilities than intelligence, such as the Porteus 
Maze and other performance tests; and to educational tests. More 
recently the method has also been used to furnish an index of
achievement. These extensions have given us besides the IQ

$$\mathrm{IQ} = \frac{\text{mental age}}{\text{mental age norm for chronological age}},$$

the coefficient of intelligence or CI

$$\mathrm{CI} = \frac{\text{point score}}{\text{point score norm for chronological age}};$$

the educational quotient or EQ

$$\mathrm{EQ} = \frac{\text{educational age}}{\text{educational age norm for chronological age}};$$

and the achievement or accomplishment quotient or AQ

$$\mathrm{AQ} = \frac{\text{educational quotient}}{\text{intelligence quotient}} \quad\text{or}\quad \frac{\text{educational age}}{\text{mental age}}$$

when the chronological age is the same. Such uses
of the method inevitably raise the question of the comparability of
the quotients obtained by different tests. Again two points are
involved. Various tests, even the various so-called intelligence tests,
may not gauge the same abilities. Whether or not they do gauge
the same abilities would be indicated by the degree of correlation found
to exist between the different tests. But of greater importance is the
fact that before a comparison can be made of the scores given by different
tests, these scores must be shown to have the same quantitative
significance over their entire range from high to low. That is, in the
absence of an a priori unit of mental measurement in terms of which
to rate the different capacities of an individual or the capacities of
different individuals, a point-by-point calibration must be made for
the scores assigned by each test to render them in any sense comparable.
The procedure for the various tests has been, however, to
assign the quotient value of 100 to the median performance at each
age, thus establishing the equivalence of the mid point in each scale;
but there has been little and most often no attempt to show whether
or not the values above and below 100 are equivalent (a) for the
different ages of any one test, and (b) for the different tests. In
general it has been taken for granted without demonstration and 
without any recognition of the need for demonstration that the entire 
range of quotients, for the intelligence tests at least, has the same 
quantitative significance as the Stanford-Binet IQ. 

In view, then, of the extended use of the quotient as an index of 
intellectual, educational and achievement status and of its steadily 
increasing sphere of application, it may not be amiss to add one more 
paper in protest against the method and to make a plea for the general 
adoption of a more consistent unit for the specification and
quantitative treatment of test results. The standard measure method4
furnishes such a unit. It involves a more elaborate calibration of 
the test, to be sure, but with the assumption as a working hypothesis 
that abilities are normally distributed, it provides a unit which is 
comparable for all ages and all tests. The quotient method on the 
other hand furnishes a unit which has only an apparent quantitative 
significance, the illusory character of which is too easily lost sight of 
and ignored by the many who use the quotient as a number without 
having a comprehension of the underlying theoretical assumptions. 
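
As a rough illustration of the contrast drawn here, the Python sketch below expresses a score as a standard measure. It is not part of the original paper and assumes that "standard measure" means the deviation of a score from the mean of its own age group, expressed in units of that group's standard deviation; the function name and the sample scores are hypothetical.

```python
from statistics import mean, pstdev


def standard_measure(score, age_group_scores):
    """Deviation of a score from its age-group mean, in units of the group's sigma."""
    m = mean(age_group_scores)
    s = pstdev(age_group_scores)
    if s == 0:
        raise ValueError("the age group shows no variability")
    return (score - m) / s


# Made-up point scores for one age group, for illustration only.
age_group = [40, 45, 50, 55, 60]
print(round(standard_measure(60, age_group), 2))
print(standard_measure(50, age_group))
```

Because each age group and each test is referred to its own mean and sigma, measures obtained in this way share a common unit under the working hypothesis of normal distribution stated in the text.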

In this paper then the attempt is made to restate briefly the theo- 
retical assumptions underlying the quotient method and to show how 
these have been met in practice by the various quotients in vogue. 


THEORETICAL ASSUMPTIONS UNDERLYING THE QUOTIENT METHOD 


The conditions that must be fulfilled if the quotient method is to 
be considered a valid device for providing comparable measures have at 
various times been stated by Terman, Otis, Freeman and Kelley. 
These statements have taken various forms but their import has been 
the same. Kelley for example says: ‘‘The minimum number of condi- 
tions which must be met before two scales can be fully equated are 
three. The conditions are (a) one point of the first (scale) must be
known to be equal to one point of the second, (b) a second point of the
first must be known to be equal to a second point of the second, and
(c) the law establishing the relationship between successive points on
the first must be known to be the law underlying the second.’’5 If
then we can show that these three conditions are met in case of the
quotients obtained for the various ages by any one test, the quotient
method furnishes a comparable measure for these ages; and if these
three conditions are satisfied in case of different tests we are again
provided with a comparable measure for these tests whether they be
of intellectual, educational, or other abilities. Now in the quotient 
method the value of 100 is always assigned to the median performance 
at each age and for each test, thus fulfilling condition (a). If either 
the quartile deviation or the standard deviation of quotient can be 
shown to be the same at the different ages or for the different tests, a 
second point on the various scales is equated and condition (b) is
satisfied. In other words the fulfilling of these two conditions results 
in the establishment of a unit of the scale in terms of the range between 
the median and either the quartile or the standard deviation, or a 
unit in terms of the absolute variability of the distribution. The 
third condition can never be proved to have been satisfied, but as a 
working hypothesis the assumption must be made that the relationship 
between the successive points on the scales for different ages and for 
different tests is the relationship of ‘‘normal’’ or some known form of 
distribution if our work of comparing the various abilities of individuals 
is to proceed at all. 

Freeman’s discussion centers around the validity of the IQ as a 
device for indexing intellectual status at various ages by any one test, 
and he compares the situation for various tests in this regard. A 
comparison of the quotients obtained for different tests was not con- 
sidered in his discussion. Freeman states the conditions which are 
to be fulfilled by a valid quotient unit in terms of the form of the 
growth curves rather than of constancy of variability of quotient at 
all ages. The principle involved is, however, the same as in Kelley’s 
discussion. That is, Freeman shows that if IQ’s of all magnitudes 
have the same significance at various ages for a given test, the curves
of growth obtained for the different levels of intelligence must diverge 
from the line of median growth in strict proportion to the increase in 
chronological age. For example, if we compare the growth curves 
for the 90, 75, 25 and 10 percentile of each age with that for the 50 
percentile, the advance and retardation from the median rated in 
mental years must be shown to increase in proportion to the increase in 
chronological age. Now when this occurs it is obvious that the quartile
deviation of mental age increases in proportion to the increase in
chronological age and, since by the method the mean mental age equals the
chronological age, that the relative variability of mental age
$\left(QD_{MA} / \text{Mean}_{MA}\right)$ remains constant at all ages. This is true then also of the standard
deviation or σ of mental age and the coefficient of variability based
on σ. It follows from this that the QD and σ of IQ are also constant
at the different ages since

$$\frac{QD_{MA}}{\text{Mean}_{MA}} = QD_{IQ} \quad\text{and}\quad \frac{\sigma_{MA}}{\text{Mean}_{MA}} = \sigma_{IQ}.$$


Freeman then states the requirements for a valid IQ in terms of 
increasing variability of mental age from the mean so that constancy 
of relative variability of mental age is maintained at all ages; Kelley, 
in terms of constancy of absolute variability of IQ at all ages. 
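
The equivalence of the two statements can be seen numerically. The Python sketch below is not part of the original paper; it assumes the usual definition IQ = 100 × mental age / chronological age, and the ages and quartile deviations are made-up figures chosen only to contrast diverging with parallel growth curves.

```python
def qd_of_iq(qd_mental_age_months, chronological_age_months):
    """Quartile deviation of IQ, taking IQ = 100 * mental age / chronological age."""
    return 100.0 * qd_mental_age_months / chronological_age_months


ages = [72, 96, 120, 144, 168]  # chronological ages in months (6 to 14 years)

# Case 1: QD of mental age grows in proportion to age (here 10 per cent of it),
# i.e. the growth curves diverge from the median as Freeman requires.
diverging = [qd_of_iq(0.10 * ca, ca) for ca in ages]

# Case 2: QD of mental age stays at 9 months at every age,
# i.e. the growth curves run parallel to the median.
parallel = [qd_of_iq(9.0, ca) for ca in ages]

print([round(v, 1) for v in diverging])
print([round(v, 1) for v in parallel])
```

In the first case the QD of IQ is the same at every age; in the second it shrinks steadily as age increases, so that a given quotient would not carry the same significance at different ages.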

Applied to scales where the results are expressed in points, the same 
principles should hold. In order that the coefficient of variability 
can be considered a comparable measure at various ages for any test, 
it must be shown that the quartile or the standard deviation of points 
increases relatively to the increase in the norm (the median) in points 
for each age, in which case the quartile or the standard deviation of 
coefficient of intelligence also is constant for each age. And in case 
of different scales if it is desired to compare, average, or divide the 
quotients obtained or to use the same interpretation of quotient values 
throughout the entire range from high to low, it must again be shown 
that the absolute variability of the quotient distribution is the same 
not only for all ages within a test, but that it is the same between tests. 





$\sigma_{IQ}$ AND $\sigma_{CI}$


Freeman has published curves showing the median and other 
percentile scores at different ages for some tests scored in points. He 
noted in these cases that the growth curves for the different intelligence 
levels do not diverge from the median curve in proportion to the 
increase in chronological age but are approximately parallel to it. I 
wish to supplement Freeman’s data with the evidence which I have 
been able to collect concerning the standard deviation of the quotient 
(a) for tests scored in mental age or in points converted into mental 
age, and (b) for tests scored in points. This is shown in Table I. Two
lines of comparison are indicated, first, between the quotient variabili- 
ties at different ages for one test (the columns of the table); and second, 
between the quotient variabilities of different tests at any age (the 
rows of the table). The data as regards the various tests in use today 
are incomplete in large part because in many cases only the median 
scores for the various ages are given and the variabilities of the dis- 
tributions or the percentile scores from which the variabilities could 
be computed are omitted. I make no claim however to have exhausted 
the available data on these points. Any omissions are due to lack 
of time to search the voluminous literature, not to an arbitrary selection
of cases. I have not been able to find anything on the comparative
variability of the educational quotient at different ages or for
different scales. The following statement by Burt, “Since the standard
deviation of a normal age group varies in almost direct proportion
to the group’s average age, the statistical requirements of the educational
deviation are largely fulfilled by the educational ratio,” was
based not on data derived from educational tests, but on teachers’
estimates of educational attainments in terms of grades and standards
as defined by the Board of Education codes. In the few educational
tests where age norms are supplied, variabilities and percentiles for 
the different ages are not given. Variabilities are frequently cited 
for the various school grades but these do not furnish the comparative 
data desired because of the selection of cases which occurs in the upper 
grades and the wide range of chronological age within a grade. 

In Table I the σ’s of intelligence quotient or the σ’s of coefficient of
intelligence are given for various ages of the following tests: Stanford-Binet;
Burt’s Revision of the Binet; Dearborn Group Test of Intelligence,
Series II; Healy Pictorial Completion Test II; Army Alpha,
Army norms and Kohs-Proctor norms; Porteus Mazes; Northumberland
Mental Tests; Pressey Primer Scale (from norms supplied
by the author); Pressey X-O Scale; Kingsbury Primary Group
Intelligence Scale; and Pintner Non-language Mental Tests. In
the starred cases σ has been computed from the QD assuming normal
distribution of scores; in cases marked with an obelisk, from the
30 percentile deviation. When the computation of the σ of quotient has
been made by me from data supplied, the formula used has been
σ of score / Median score = σ of quotient.
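
The two conversions described above can be sketched as follows. This is a minimal illustration, not the author's worksheet: the constant 0.6745 is the usual normal-curve relation between the quartile deviation and σ, the function names are hypothetical, and the quotient is assumed to be expressed on the customary scale on which the median is 100.

```python
# Under a normal distribution the quartile deviation (QD) is about 0.6745 sigma.
QD_PER_SIGMA = 0.6745


def sigma_from_qd(qd):
    """Estimate sigma from a quartile deviation, assuming normal distribution."""
    return qd / QD_PER_SIGMA


def sigma_of_quotient(sigma_score, median_score):
    """Sigma of the quotient, in quotient points, from the sigma and the
    median of the raw scores (median taken as a quotient of 100)."""
    return 100.0 * sigma_score / median_score


# Illustrative figures only (not taken from Table I).
example_sigma = sigma_from_qd(6.745)          # about 10
example_quotient_sigma = sigma_of_quotient(6.0, 40.0)  # 15
```

A QD of IQ between 7 and 8 corresponds on this rule to a σ of roughly 10.4 to 11.9.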

In some of the cases listed in the table, for example the Pictorial
Completion, the Pintner and the Pressey tests, the authors recommend
percentile, not quotient, specification. I have, however, determined
the σ’s of quotient for these cases also in order to offer further evidence
concerning the general inapplicability of the quotient as a method of 
specifying test scores for purposes of comparison, classification or 
prediction. 

Table I brings out the following points. 

TABLE I. SHOWING THE STANDARD DEVIATION OF INTELLIGENCE QUOTIENT OR
OF COEFFICIENT OF INTELLIGENCE AT DIFFERENT AGES FOR VARIOUS TESTS

[Body of table not reproduced.]

* Computed from QD, assuming normal distribution.
† Computed from 30 percentile deviation, assuming normal distribution.

Age 12, σ_IQ: Stanford-Binet, 17.0; Herring Revision Binet, 17.1
Age 7, σ_IQ: Stanford-Binet, 13.5; Porteus Maze Tests, 30
High School Freshman, σ_IQ:
    Stanford-Binet, 12.6
    Army Alpha, 14.7
    Terman Group Test, 13
    Illinois General Intelligence, 15.5
    Haggerty Delta 2, 19
    Miller Mental Ability Form A, 22.9
    Miller Mental Ability Form B, 18.9
    Pressey Senior Classification, 16.6
    Otis Self-administering Higher A, 10.8

1. σ_IQ is more constant at different ages for the Stanford-Binet
than for any other test. It varies for this test from 9.7 to 11.9. In
Burt’s revision it varies from 8 to 15; in the Dearborn Group test from
10 to 21; in the Pictorial Completion test from 19 to 38; in the Army
Alpha from 8.3 to 9.6, Army norms, and 13.8 to 15.5, Kohs-Proctor
norms.*

2. σ_IQ varies in magnitude for different tests, apparently being larger
for tests of the non-verbal and performance type. This wider spread 
of performance ability was found also by Kohs in the Block Design 
test. The quotients for these various tests are therefore less com- 
parable the more widely they depart from the median value. 

3. σ_IQ is on the whole more constant for the different ages than
is σ_CI. The deviation from constancy in the CI is so marked in each
test for which these data were obtainable that this quotient could 
have no validity whatever if applied for the purposes of specification 
of the performance of children of different ages. For example, in 
the Pressey Primer test it ranges from 24 to 50; in the Pressey X-O 
from 22 to 36; in the Kingsbury Primary scale from 34 to 76; and in 
the Pintner Non-language test from 15 to 73. 

The figures given in Table I present material for a comparison of
the σ_IQ for only a narrow range of tests. I have listed below the table,
therefore, additional data for other tests given to specified groups of
subjects which were not in all cases sufficiently large or unselected
to be considered representative. The σ’s nevertheless afford some
comparison of the scatter of scores for the tests in question for the
particular groups. These comparisons are for the Stanford and
Herring Revisions of the Binet test; the Stanford-Binet and Porteus
Maze tests; the Stanford-Binet and eight group tests of intelligence;
and the Stanford-Binet and the Pintner-Patterson Performance
Scale.

If the quotient method were used in these various cases, the 
equivalent to a Stanford-Binet IQ of 90 for a 7-year old child would 
be 86 on the Burt Revision, about 70 on the Porteus Mazes, 80 on the 
Pintner-Patterson Performance Scale, and 81 on the Pictorial Com- 
pletion test; the equivalent to a Stanford-Binet IQ of 113 for a High 
School Freshman would be 113 also on the Terman Group test, 123 on 
the Miller, 119 on the Haggerty, 111 on the Otis, etc. Further, these 
relative quotients are not constant at other ages. Since then the 





* The figures for the Army Alpha test were computed from the middle 50 per
cent range given by Van Wagenen from results obtained from several hundred
school children in grades V to VIII. The median mental age for each of the
chronological age groups is very high on both the Army and the Kohs-Proctor
norms. For this reason the σ’s given in the table for this test may not be
representative of what would have been given by more unselected groups.
IQ values above and below 100 have such different meanings for
different tests, there is little wonder that on this score alone Gates 
should have found a wide range of IQ’s for an individual when tested 
by different group tests; and that, if the Stanford-Binet classification 
is applied to all quotients alike, he should have found, as he says he 
does, that a pupil is “classified as average (by one test and) by another
a genius.” The conclusion is inevitable that if the quotient method
is to continue to serve for specification and classification of scores 
obtained by various tests, some scale of values must be adopted as 
standard and all quotients made to conform to this scale by a process 
of calibration, perhaps by the simple graphic device proposed by 
Miller.²⁴ This calibration would, moreover, have to be made for the
different ages of each test as well as for the different tests. 
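
The equivalents quoted above for the High School Freshman group can be approximated by supposing that equivalent quotients lie the same number of σ’s from 100 on each test. That rule is an assumption of this sketch; the text does not state the computation, and the function name is hypothetical. The σ values are those listed with Table I.

```python
def equivalent_quotient(iq, sigma_ref, sigma_other):
    """Quotient on another test lying as many sigmas from 100
    as `iq` lies on the reference test (assumed rule)."""
    return 100.0 + (iq - 100.0) * sigma_other / sigma_ref


SIGMA_STANFORD_BINET = 12.6  # High School Freshman group
SIGMA_OTHER = {
    "Terman Group Test": 13.0,
    "Haggerty Delta 2": 19.0,
    "Otis Self-administering Higher A": 10.8,
}

# Equivalents of a Stanford-Binet IQ of 113 on each of the other tests.
equivalents = {
    name: equivalent_quotient(113, SIGMA_STANFORD_BINET, sigma)
    for name, sigma in SIGMA_OTHER.items()
}
```

On these assumptions the results fall within a point of the 113, 119 and 111 given in the text for the Terman, Haggerty and Otis tests.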


A COMPARISON OF THE VARIABILITY OF THE IQ AND THE CI FOR THE
SAME TEST


Dearborn has presented data for his advanced group test of intelli- 
gence from which an interesting comparison of the variability of the 
IQ and the CI at different ages can be made when both sets of quotients 
are computed from the same material. The Dearborn test is scored in 
points which are converted into mental age by the aid of a table of 
age standards.* Elsewhere Dearborn has given the QD of IQ for 
ages 7 to 18. From this, assuming symmetrical distribution, I have
computed the mental age of the 75 and 25 percentile case at each age 
and working backwards, have secured the score in points for these 
cases from the table of norms. This furnishes the data needed to 
compute the QD of CI for the various ages. The QD’s for both 
quotients for ages 7 to 13 are given in Table II. As is seen in the last 
two columns of the table the variability of quotient from year to year 
is much more uniform when the data are expressed in mental age than 
when expressed in points. The 25 percentile of IQ, for example, ages 
7 to 13, is respectively, 93, 90, 90, 88, 89, 87 and 86; the 25 percentile 
of CI for these ages is 25, 50, 65, 72, 75, 72 and 71. It is obvious that
the former quotient provides a more consistent unit for specifying 
the results of children of different ages than does the latter. 
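
The arithmetic behind Table II can be retraced in a few lines. The sketch below uses the point scores and IQ’s printed in the table for ages 7 to 12, takes the CI as 100 times the score divided by the median score for the age, and takes the QD as half the distance between the 75 and 25 percentile values. The variable and function names are hypothetical, and rounding the CI to the nearest whole number is an assumption.

```python
# age: (points at 75 percentile, median points, points at 25 percentile,
#       IQ at 75 percentile, IQ at 25 percentile), as printed in Table II
TABLE_II = {
    7: (14, 8, 2, 107, 93),
    8: (29, 20, 10, 110, 90),
    9: (42, 31, 20, 110, 90),
    10: (55, 43, 31, 112, 88),
    11: (69, 55, 41, 111, 89),
    12: (85, 67, 48, 113, 87),
}


def coefficient_of_intelligence(points, median_points):
    """CI: score in points as a percentage of the median score for the age."""
    return round(100 * points / median_points)


def quartile_deviation(q75, q25):
    """Half the distance between the 75 and 25 percentile values."""
    return (q75 - q25) / 2


qd_iq = {}
qd_ci = {}
for age, (p75, p50, p25, iq75, iq25) in TABLE_II.items():
    qd_iq[age] = quartile_deviation(iq75, iq25)
    qd_ci[age] = quartile_deviation(
        coefficient_of_intelligence(p75, p50),
        coefficient_of_intelligence(p25, p50),
    )
```

The QD of IQ so obtained stays between 7 and 13, while the QD of CI falls from 75 at age 7 to about 25 at age 11.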


TABLE II. SHOWING FOR COMPARISON THE AGE VARIABILITY OF IQ AND CI
WHEN BOTH QUOTIENTS ARE COMPUTED FROM THE SAME MATERIAL

Age   Percentile   Mental age   Points    IQ     CI    QD_IQ   QD_CI
 7        75          7-6          14     107    175
          50          7-0           8     100    100      7     75
          25          6-6           2      93     25
 8        75          8-10         29     110    145
          50          8-0          20     100    100     10     47.5
          25          7-2          10      90     50
 9        75          9-11         42     110    135
          50          9-0          31     100    100     10     35
          25          8-1          20      90     65
10        75         11-0          55     112    128
          50         10-0          43     100    100     12     28
          25          9-0          31      88     72
11        75         12-2          69     111    125
          50         11-0          55     100    100     11     25
          25          9-10         41      89     75
12        75         13-6          85     113    127
          50         12-0          67     100    100     13     27.5
          25         10-5          48      87     72
13        75         14-10        101     114    128
          50         13-0          79     100    100     14     28.5
          25         11-2          56      86     71


SOME CAUSES OF THE FLUCTUATION OF THE IQ ON RETEST


As stated earlier, the experimental evidence on the validity of the
Stanford-Binet IQ has taken the form of retest data. It may not be
out of place to list here the factors that may cause a fluctuation of the
IQ on retest. Some of these factors are inherent in the test; some are
extraneous. Illustrative reference will be made to Chart 1 in which are
given some typical results of retesting that have accumulated in our
records.

1. A possible lack of equivalence of the quotient unit at different 
ages. 

2. An actual change in intelligence. 

3. A change in tested intelligence due to change of either environment
or schooling. The quantitative effect of these factors is not
known but that they exist seems unquestionable. Most of us who have
tested young children before and after they have received school
training have noted the increase in IQ that frequently occurs with the
second test. This type of increase will be noted in the IQ curves
given in Chart 1 for L. T., R. T., and D. P. An increase in IQ after a
child has been transferred from the ordinary to the M. D. school has
been noted in England by Burt and is seen in the case of D. L. after
his transfer from the regular school class to a special class for backward
children. This case is discussed in detail below in another connection.
The variable effect of change of environment on tested intelligence is
seen in the curves for P1, P2, P3, P4, and P5. These represent children
of one family who were tested at the request of a social service organization.
Up to the time of the first test the family had lived in extreme
poverty and in a most unfavorable environment. After the test its
living conditions were greatly improved and were supervised by this
organization. Retest after 18 months showed the following:

In the cases of two of the children, the oldest, P1, and the next to
the youngest, P4, there was no change in the quotient on retest. The
two tests gave for the one child 53 and 53; for the other 50 and 49.
In the cases of two of the children, the third oldest, P3, and the youngest,
P5, there was an increase in the quotient, the former showing a gain of
4 points, from 57 to 61, the latter of 10 points, from 49 to 59. In the
case of the remaining child, the second oldest, P2, there was no change
whatever in the mental age at the second test and consequently a
lowered quotient of 7 points, from 63 to 56. This child, who was
below 12 at the time of the first test, will be cited below in the discussion
of the effect of mental arrest on the IQ.

4. A change in the tested intelligence due to fluctuation of ability,
interest or attention, or due to the variable response of an unstable or
psychopathic personality.

5. The unreliability of the test as measured by the PE of the score.
This is estimated by Otis and Knollin to be equal to not more than
3.5 IQ points at ages 7 and 14 in 50 per cent of cases.

6. The discrete or discontinuous character of the mental age scores.
Below year 12 these are with one exception in 2-month steps. The
IQ, however, is apparently continuous. Actually it is unequally
discontinuous for the different ages. At mental age 6, for example,
the IQ steps are 3 points apart; at 12, 1.5 points apart.

7. Marginal passes and failures.

8. Errors in giving and scoring the test. 

9. The onset of mental arrest. This is a factor for which it is 
very difficult to allow. In Chart 1 are given some mental age and IQ 
curves for children retested under my direction. Among these are 
found cases which show no gain in mental age after the chronological
age of 10 (R. C.); others after 11 or earlier (M. H. and J. W.) and
after 12 or earlier (A. W. and P2). On the other hand cases of the
same or lower order of intelligence showed no arrest at 15 (F. K. and
P1), and one case, cited below, having an IQ of 60, is not yet arrested
at 16.5 (D. L.). The lowered IQ in the arrested cases is shown at the
right of the chart. In any case of lowered IQ after the age of 10 the 
possibility of mental arrest as the cause must be considered. 

10. The effect of the magnitude of the chronological age on a varia- 
tion in the IQ. In this connection I wish to cite an unusual case of 
late development (D. L.) that was retested under my direction at year 
intervals for 5 successive years where the mental age increased on an 
average of 80 per cent per year but the quotient, because of the high 
chronological age, showed no significant increase. The case is that 
of a negro boy who was first tested in 1920 at the age of 12 years 8 
months and scored a mental age of 7 years 4 months, IQ 58. The boy
had been in school 6 years and was at the time in the second grade. 
The following school term he was placed in a special class for backward 
children which had just been formed in the school. He was retested 
in 1921, 1922, 1923 and 1924. At the time of the last test he was still 
in the special class but had shown considerable improvement. The 
results of the retests are given in Table III.


TABLE III. RETESTS OF D. L.

Date of     Chronological         Mental                Intelligence    MA increase /
test        age                   age                   quotient        CA increase
4/20/20     12 years 8 months     7 years 4 months          58
6/1/21      13 years 9 months     8 years 4 months          61              .93
5/22/22     14 years 9 months     9 years 2 months          62              .83
4/26/23     15 years 8 months     9 years 8 months          60              .54
3/7/24      16 years 6 months     10 years 5 months         63              .90


I am unable to say how
much of the increase in mental age was due to practice effect or how
much to the effect of adequate schooling for his intelligence. The
point I wish to make here is that the significant increase in the boy’s 
mental age is obscured by the quotient method of specification because 
of the magnitude of the denominator of the fraction, i.e., his chronological
age. The mental age increased on an average 10 months per
year through the range of chronological age of 12 years 8 months to 16
years 6 months, that is, the boy was progressing along the 83 instead
of the 60 IQ level, yet the average gain in quotient during this period 
was only 1 point. Had the boy been half as old this annual rate of 
increase in mental age would have changed the quotient approximately 
twice as much. The annual quotient rate of increase is shown in the 
last column of the table. 
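
The point can be checked against the ages in Table III. The sketch below is only an illustration with hypothetical names: it converts the ages to months, takes the ratio of mental-age gain to chronological-age gain between successive tests, and then compares the one-year change in quotient for D. L. with that of an imagined child half as old who starts at the same quotient and gains mental age at the same rate.

```python
# (chronological age, mental age) in months, from Table III
RETESTS = [
    (12 * 12 + 8, 7 * 12 + 4),
    (13 * 12 + 9, 8 * 12 + 4),
    (14 * 12 + 9, 9 * 12 + 2),
    (15 * 12 + 8, 9 * 12 + 8),
    (16 * 12 + 6, 10 * 12 + 5),
]


def iq(ma_months, ca_months):
    """Quotient as mental age over chronological age, times 100."""
    return 100.0 * ma_months / ca_months


# Ratio of MA increase to CA increase between successive tests.
ratios = [
    (ma2 - ma1) / (ca2 - ca1)
    for (ca1, ma1), (ca2, ma2) in zip(RETESTS, RETESTS[1:])
]

# Average rate of mental growth over the whole period (MA months per CA month).
first_ca, first_ma = RETESTS[0]
last_ca, last_ma = RETESTS[-1]
overall_rate = (last_ma - first_ma) / (last_ca - first_ca)


def one_year_iq_gain(ca_months, ma_months, rate):
    """Change in quotient after 12 months of growth at `rate`."""
    return iq(ma_months + 12 * rate, ca_months + 12) - iq(ma_months, ca_months)


gain_actual = one_year_iq_gain(first_ca, first_ma, overall_rate)
# Hypothetical child half as old, same starting quotient, same monthly MA gain.
gain_half_age = one_year_iq_gain(first_ca / 2, first_ma / 2, overall_rate)
```

The ratios agree with the last column of the table to within a hundredth, the overall rate is about 0.80, and the younger hypothetical child gains nearly twice as many quotient points in the year.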


THE ACHIEVEMENT QUOTIENT 


The AQ is the latest recruit to the army of quotients. It is claimed 
that by dividing the EQ by the IQ, or the EA by the MA when the 
CA is constant, an index of achievement is furnished. I am omitting 
from this discussion any mention of the unreliability of the AQ due 
to the unreliability of both the numerator and the denominator of 
the fraction together with certain other criticisms of the concept which 
have been ably discussed among others by Chapman, Toops and 
Symonds, Kelley and Ruch. I wish to limit myself here to a criticism 
naturally raised by the preceding discussion of the quotient technique, 
that is, a criticism of the procedure of dividing one unit by another 
which has not been shown either logically or empirically to be the 
equivalent of that unit. We are early taught that we must not 
divide months by years, grams by ounces, centimeters by inches. 
Why, then, should we divide EQ’s by IQ’s or EA’s by MA’s without 
proof of their equivalence at other points than at the median?* If, 
for example, it should prove that the EQ is a smaller unit than the 
IQ, then for cases falling increasingly above the median the EQ would 
be increasingly smaller than the corresponding IQ; but for cases falling 
increasingly below the median, the EQ would be increasingly larger 
than the corresponding IQ. The effect of this on the AQ is the opposite
in the two cases. Suppose the σ_EQ = 10 and the σ_IQ = 15. Then
the child who is 1σ above the median in both intelligence and educational
attainments, whose AQ should, of course, be 100, will be rated
as 110/115 or 96; and the child who is 2σ above the median in both
ratings, whose AQ also should be 100, will be rated as 120/130 or 92.
On the other hand the child who is 1σ below the median in both intelligence
and educational attainments is rated as 90/85 or 106 instead





* The usual point of view has been that expressed by Stebbins and Pechstein
in an article entitled “Quotients I, E and A.” Having assigned to each child an
MA and an EA rating, they write: “With our material now in terms of comparable
units, we were able to combine the mental ages and educational ages in one
measure, the Accomplishment Quotient.”
of 100, and the child who is 2σ below the median in both ratings as
80/70 or 114. In other words, from this factor alone the brighter 
the child the lower will be his AQ even if there is no actual difference 
in his intellectual and educational status; and the duller the child, 
the higher will be his AQ. 
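
The worked figures above can be reproduced directly. The sketch assumes, as the example does, a σ_EQ of 10 and a σ_IQ of 15 with both medians at 100; the function name is hypothetical.

```python
def achievement_quotient(z, sigma_eq=10.0, sigma_iq=15.0):
    """AQ of a child standing z sigmas from the median in BOTH
    intelligence and educational attainment."""
    eq = 100.0 + z * sigma_eq
    iq = 100.0 + z * sigma_iq
    return 100.0 * eq / iq


# Children at +1, +2, -1 and -2 sigma, as in the text.
aq_by_standing = {z: round(achievement_quotient(z)) for z in (1, 2, -1, -2)}
```

With equal σ’s the function returns 100 at every standing, which is the value such a child’s AQ should have.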

I have the following evidence to offer that the EQ unit is smaller 
than the IQ unit. 

1. Burt says: “Individuals vary distinctly more in intelligence
than they do in educational ability, in effect about a quarter as much
again.” Another citation: “the general range of deviation is for
educational attainments much narrower than for intelligence.”

2. Ruch, in a comparison of the AQ’s given by various educational 
test batteries when teamed with the Stanford-Binet intelligence test, 
obtained for 64 children, Grades V to VIII, the following standard 
deviations: σ_IQ (Stanford-Binet) 14.2; σ_EQ (Stanford Achievement
Test Form A) 10.4; σ_EQ (Lippincott-Chapman Products Survey) 12;
σ_EQ (Illinois Examination) 16.3. With the exception of the Illinois
Examination, which is a very short test and was intended to be teamed
with the Illinois Intelligence Examination, the σ_EQ is for this group
of children smaller than the σ_IQ.

3. The Jersey Composite Test results are expressed in point scores 
for which percentile norms are provided for Grade V for both the 
intelligence and the educational test series. It is possible for this 
composite test, then, to compare the QD of quotient (QD score /
Median score) of the educational test battery with that of the intelligence
test battery. These quartile deviations are in the ratio of 16
to 24.³⁵

4. The Lippincott-Chapman Classroom Products Survey is provided
with EA norms and percentiles for grade. From these the
σ_EQ for grade can be computed, assuming normal distribution. For
Grades V to VIII they are respectively 5.4, 8.2, and 8.7. These 
deviations represent, it will be noted, grade—not age distributions. 
However, certain evidence points to the fact that the grade IQ varia- 
bility and the age IQ variability are for the upper grades quite similar. 
Dearborn, for example, gives the following QD’s of IQ for age and 
grade based on his advanced group test of intelligence: ages 7 to 13 
respectively 7, 10, 10, 12, 11, 13 and 14; Grades II to VIII respectively 
8, 10, 11, 10, 11, 13 and 12. On the Terman Group Test of Mental
Ability the QD’s of IQ for Grades VII to IX fall between 7 and 8 
which is the same variability as is found for the Stanford-Binet IQ 
at the various ages. Supplementing this item with evidence cited
previously (Table I) that the σ’s of IQ for the Stanford-Binet and the
Terman Group Test are very similar, for High School Freshmen
being 12.6 and 13 respectively, and in addition having been shown
for Grade VII₁ to be 15.4 and 14.4, for Grade VII₂, 14.5 and 14,
we may perhaps conclude that the ratio of the Lippincott-Chapman 
EQ variability to the Stanford-Binet and Terman Group Test IQ 
variability is approximately 9 to 13. 

5. A small σ_EQ was obtained by Kelley also for the Stanford Achievement
Test for Grade VIII. Converting the median and σ scores
given by him into EA’s by the use of the Stanford Achievement
norms gives for this grade a σ_EQ of only 6.8, a value considerably smaller
than the σ_IQ usually found for this grade.

I hope I shall not be understood to claim that the conclusion that 
bright children tend to have low AQ’s and dull children high 
AQ’s is entirely an artifact depending on the probable difference in 
magnitude of the EQ and the IQ units. This tendency would be 
expected from our present school organization. I do claim, however, 
that the quotient method of stating the relationship of educational 
to intellectual status tends to produce a difference between the bright 
and dull children in the direction actually found. For example, 
Beeson and Tope write that results “disclose the fact that those
pupils whose intelligence quotients are above 100 almost invariably
have lower accomplishment quotients, while those whose intelligence
quotients fall below 100 have higher accomplishment quotients. In
the 100 cases only three pupils whose intelligence quotients were
above 100 made higher accomplishment quotients, and only three
pupils whose intelligence quotients are below 100 made lower accomplishment
quotients. These six exceptions were all pupils whose intelligence
quotients fell between 90 and 110 (i.e., close to the median).”
Now Beeson and Tope had based their AQ’s on the Stanford Achieve- 
ment Test teamed with the Terman Group Test of Mental Ability. 
As stated earlier, the ratio of the standard deviations for these tests 
is approximately 10 to 14. It can hardly be denied that part, at 
least, of their findings is due to this inequality of the unit used in the 
numerator and the denominator of the AQ fraction. Pintner, on 
the other hand, who did not use the quotient method of estimating 
achievement but the difference in standard educational and intelligence 
ratings, based on the o as a unit in both cases, found that only 21 
per cent of the pupils showed marked inability in school attainment as 
compared with their mental ability, and 47 per cent of these or only 10
per cent of the total group were children classed as “mentally bright.”

The point made above is a simple one and can easily be checked 
by a comparison of the EQ and IQ variabilities for different ages 
and for different tests. Until this is done, however, one more ques- 
tion must be raised concerning the value of the AQ as an index of 
achievement. 

In conclusion I wish to say that I am fully aware that there are 
many who urge that despite theoretical objections, the quotient is 
valuable as a practical concept and that the errors in the tests them- 
selves outweigh the errors introduced by the use of a unit of specifica- 
tion which has not strict comparability in all cases. Nevertheless 
the fact must also be recognized that the quotient has an apparent 
numerical significance as a unit wherever it is applied and that all 
quotient scales whether comparable or not at any point other than 
the median are being interpreted as comparable throughout the entire 
range of values because of the similar nomenclature given to the unit. 
This leads inevitably in many instances to an incorrect interpretation 
and comparison of the quotients for different ages of one test and for 
different tests of the same or different abilities. 

On a program of reconstruction two plans are conceivable: 

1. To adopt for all test scores an arbitrary scale of values having
a fixed zero and a fixed scale number or unit which is to be applied to
each σ or fraction of σ of score above the zero. Rugg has suggested
that on such a scale −2.5σ be taken as zero, +2.5σ as 100, and 10
scale marks represent each 0.5σ. On this scale the median performance,
whether expressed in mental age, points or quotients, for each
test and for each age of the test would be specified as 50, and to each
σ of score above and below the median an increment or decrement of
20 would be ascribed.⁴¹ McCall has adopted in his T-scale −5σ of
12-year-old children as the zero of the scale, +5σ as 100, and a unit
increment above 0 for each 0.1σ.⁴² There are, however, objections to
be raised against the adoption of the 12-year-old performance as the 
basis of the scale. It would seem to me a more feasible plan, even 
though a more cumbersome one, to calibrate the performance at each 
age of the test to the adopted scale. 
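
Both scales reduce to a linear function of the standard score. A minimal sketch, with hypothetical function names and with the median and σ of the reference group supplied by the user:

```python
def standard_score(score, median, sigma):
    """Distance of a score from the median, in sigma units."""
    return (score - median) / sigma


def rugg_scale(z):
    """Rugg's proposal: -2.5 sigma = 0, +2.5 sigma = 100, 20 marks per sigma."""
    return 50.0 + 20.0 * z


def mccall_t(z_twelve_year_olds):
    """McCall's T-scale: -5 sigma = 0, +5 sigma = 100, 10 units per sigma,
    where sigma is that of 12-year-old children."""
    return 50.0 + 10.0 * z_twelve_year_olds


# A score one sigma above the median of its own age group (illustrative figures).
z = standard_score(58.0, 50.0, 8.0)
rugg_value = rugg_scale(z)
```

Calibrating each age separately, as suggested above, amounts to computing the standard score from the median and σ of the child’s own age group before applying the scale.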

2. As suggested earlier in the paper, it may be deemed advisable 
to retain the well-known quotient concept in its Stanford-Binet sense 
and with it the IQ classification based on this quotient, and arbitrarily 
to make the quotients for other tests conform to this numerically by 
applying to them the factor which would render them the equivalent
of the Stanford-Binet quotient. This factor would be based on the
ratio of the σ of the Stanford-Binet to that of the test in question.
In this plan the starting point of the scale would be the median performance
to which the value of 100 would be given. The Stanford-Binet
IQ would be the unit of the scale and the σ would equal the σ of
the Stanford-Binet distribution of IQ’s. To calibrate a scale of IQ’s
to this standard for any other test, the σ at each age for that test would
have to be obtained and the factor determined for each age which would
render it equal to the σ of the Stanford-Binet test. This factor would
then have to be applied to the difference between 100 and the crude
IQ obtained for the test in question. In this way each corrected IQ
is made equivalent to the corresponding Stanford-Binet IQ adopted
as standard and so becomes the numerical equivalent of this or any
other quotient which has been similarly calibrated. Were such a
plan adopted, it would seem wise to redetermine the σ_IQ of the Stanford-Binet
test for each age, basing the determination on the results of
a larger group of children than was used by Terman and one more
representative of American children in general. Herring, for example,
is of the opinion that a Stanford-Binet σ_IQ of 17 is not unusual for
12-year-old children. I have found 13.5 to be the characteristic
σ_IQ of the 7-year-old children in the first grade of the public schools in
the suburbs of Philadelphia. As the extremes of the 7-year-old
intelligences were probably not in the first grade groups, this σ, while
larger than Terman’s determination, is perhaps too small. Whatever
revision of our current practice is to be adopted should not be decided 
on hastily or put into practice before the preliminary work of stand- 
ardization is satisfactorily completed. For this, a compilation of 
the Stanford-Binet data that has been gathered over the entire country 
would be invaluable. With these data in hand the adoption of a unit 
of measure to which all scales must conform and a reconstruction of 
the numerical concepts of our intelligence and educational testing 
could readily be accomplished. 
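
The second plan can be sketched as a per-age correction. In the sketch below the Stanford-Binet σ’s for ages 7 and 12 are the 13.5 and 17 mentioned above; the σ’s of the other test are purely hypothetical figures chosen for illustration, as are the names.

```python
# Standard: sigma of Stanford-Binet IQ by age (values cited in the text).
SIGMA_STANDARD = {7: 13.5, 12: 17.0}

# Hypothetical sigma of quotient for some other test, by age.
SIGMA_OTHER_TEST = {7: 27.0, 12: 34.0}


def calibration_factor(age):
    """Factor rendering the other test's sigma equal to the standard's."""
    return SIGMA_STANDARD[age] / SIGMA_OTHER_TEST[age]


def corrected_iq(crude_iq, age):
    """Apply the factor to the difference between 100 and the crude IQ."""
    return 100.0 + (crude_iq - 100.0) * calibration_factor(age)


corrected_high = corrected_iq(120, 7)   # crude 120 at age 7
corrected_low = corrected_iq(80, 12)    # crude 80 at age 12
```

With these illustrative σ’s the factor is one-half at both ages, so crude quotients of 120 and 80 become 110 and 90 on the standard scale.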
THE ELIMINATION OF PRACTICE IN MENTAL
TESTS 


L. T. MORGAN 


Assistant Lecturer, University of Bristol, England 


Introduction.—It is a well known fundamental difficulty in many 
experiments where certain specific functional changes are estimated 
by performance or output, that data are complicated by practice. 
In work on mental fatigue these complications are especially signifi- 
cant. Muscio writing in the British Journal of Psychology comes to 
the following conclusion: “the data showed one result conclusively:
that experiments of this kind are absolutely valueless for the investigation
of fatigue because of the complication caused by practice.”¹
The first problem is therefore to obtain a satisfactory experimental 
method by which the effects of fatigue can be separated from the 
effects of practice. 

There are three main alternatives: 

1. Choice of tests, the performance of which is not affected materi- 
ally by practice or repetition. 

2. The elimination of practice by a prolonged period of repetition 
previous to the commencement of the experiment proper. 

3. The adoption of the method of “equal groups.”

With regard to the first, one might say that there are very few tests, if
any, now in use, which are not influenced by practice. This is true both of
“direct” methods (those which estimate changes in mental condition
by mental work) and of “indirect” methods (which estimate change
in a certain functional activity by correlated variations of other
functions). Practice influences computation as well as aesthesiometric
reaction. A few tests used, such as oral reading at maximum speed of 
numbers or of nonsense syllables, show little practice effect but it is 
doubtful whether they can be regarded as “tests” of mental efficiency, 
in any sense. Of the tests in current use, those most frequently 
chosen, like computation and cancellation, seem to be especially prone 
to perfection by repetition. At present, we can hardly look for a 
solution to the first alternative. 

Method of Repetitions.—(a) The method of eliminating practice by 
a preliminary period of exercise is one that has been much used of
late. It is well known that the curve of practice in such tests as 
computation and cancellation is generally of a certain form. One of 

1 British Journal of Psychology, Vol. X, Part 4, p. 328. 
its chief peculiarities is that the increase in performance is very large
at first, gradually becoming less till the graph becomes horizontal 
indicating that further practice at this stage means little or no improve- 
ment. It has been the custom by several present day workers in the 
field of mental fatigue to postpone controlled experiment till this 
point be reached. The object of the experimenter, usually, is not to 
eliminate practice completely by this method but rather to get rid of 
the “big effects” shown in the first part of the graph. Should there
be subsequent increase in skill this method of experiment is such as 
to eliminate, or tend to eliminate it. 

The following problems now present themselves: 

1. Does this horizontal portion occur uniformly for all subjects or
groups of subjects under the same conditions?

2. Are we justified in assuming that this horizontal portion occurs
after the same preliminary period of practice for all subjects in any
one particular task?

One worker writes as follows: “A preliminary set of experiments
with 10 subjects similar in class and age to the subjects described in
this experiment, had shown that to bring them to a desired state of
practice, it was necessary to do the multiplication, cancellation, ...
tests nine times each ... To obtain this result, the average score
in each test of the 10 subjects was expressed graphically and it was
seen that the graph had become approximately linear when the above
mentioned amount of practice had been done.”¹

Here we have a case where it is assumed that the stage at which 
the horizontal portion of the practice curve occurs in 42 subjects 
(as indicated by average) will be roughly the same as that for another 
group of 10 chosen subjects, in a particular set of tasks. The only 
factors mentioned as common to the two groups are those of chrono- 
logical and pedagogical age. 

In other words, individual variation is thought to be small enough to
justify inference from one group to another.

In our own work, we have met with very different results. With 
quite a number of subjects, both in cancellation and in computation, 
there has been pronounced practice effect after ten weeks’ daily testing.
Wide variation, not uniformity, we have found to be the rule, both 
for task and subject. 

Until we know more about the correlation of such factors as ability
and improvability, we fail to see that any arrangement can be satisfactory
except the tedious one of adopting a prolonged period of practice for
each set of conditions, for each subject and task.

1 Phillips: “Mental Fatigue,” p. 62.

(b) This makes it very difficult of application. As Muscio writes
in the British Journal of Psychology, “It is in theory always possible
to eliminate practice by prolonging an experiment until practice effects
become inappreciable; but such procedure is hardly ever feasible
because of the demands it makes upon the subjects.”¹ In fact there
is little opportunity anywhere for applying such a method, unless it
be with school subjects. We have already mentioned, however, that
the main purpose of a preliminary practice period is not to eliminate
the practice effect entirely but to get rid of the large effects shown at
the commencement of our graphs. The remainder is dealt with by
the actual method of experiment. One very much used at the present
time is the “Method of Reverse Order.” If the experiment be one
covering weeks of daily testing, then on completion the original order
of the experiment is reversed, and the gross total for each two
corresponding days’ performance is obtained by addition. In such a case
the initial reading will be added to the final reading; the two middle
readings will also correspond.
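
The pairing just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `reverse_order_totals` and the six daily scores are hypothetical and are not taken from any experiment cited here. The scores are chosen to rise by a constant amount each day, with no other influence at work.

```python
def reverse_order_totals(readings):
    """Add reading i to reading (last - i), as in the method of reverse order."""
    if len(readings) % 2 != 0:
        raise ValueError("expected an even number of readings")
    half = len(readings) // 2
    # The first reading is paired with the last, the second with the
    # second-to-last, and so on until the two middle readings meet.
    return [readings[i] + readings[-1 - i] for i in range(half)]


# Hypothetical daily scores: three days in the original order,
# followed by three days in the reversed order.
scores = [100, 110, 120, 130, 140, 150]
totals = reverse_order_totals(scores)
print(totals)
```

With a steady gain of this kind every pair gives the same total, which is the sense in which the method is said to allow for the remaining practice effect.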

(c) In practice this is quite a convenient method, the only grave 
objection to it being that the experiment is made double as long as in 
the original plan. Theoretically, however, there is the objection that
it postulates a uniform linear curve of progression, which is not true to
fact. Indeed we cannot fairly use it to “eliminate” practice unless
it be taken for granted that we know the progress of the curve. If 
the range of experiment covers part of a plateau and part of a sharp 
rise, our data will suffer. If it cover a plateau alone, or instead a rise 
in the curve, the possibility is that we shall do better. 
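
The dependence on a linear curve can be checked numerically. The sketch below is an illustration only: both practice curves are hypothetical (a straight line, and an assumed exponential curve that rises sharply and then flattens), and the helper `reverse_order_totals` is our own name for the pairing of first with last reading.

```python
import math


def reverse_order_totals(readings):
    """Add reading i to reading (last - i)."""
    half = len(readings) // 2
    return [readings[i] + readings[-1 - i] for i in range(half)]


days = range(8)

# Hypothetical practice curves with no fatigue effect at all.
linear = [100 + 10 * d for d in days]                    # uniform rise
curved = [200 - 100 * math.exp(-0.5 * d) for d in days]  # sharp rise, then plateau

linear_totals = reverse_order_totals(linear)
curved_totals = reverse_order_totals(curved)

# If practice were fully cancelled, every pair would give the same total.
linear_spread = max(linear_totals) - min(linear_totals)
curved_spread = max(curved_totals) - min(curved_totals)

print("linear spread:", linear_spread)
print("curved spread:", round(curved_spread, 1))
```

On the straight line the paired totals agree exactly. On the curve that covers both a sharp rise and a plateau they differ widely, although nothing but practice is present in the assumed data.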

Another important point needs emphasis here. Experiment
seems to suggest that variability of performance is greatest at a high
stage of practice. This means that the “method of reverse order”
is more effective for portions early in the practice curve than for the
later phases, though a higher stage of skill has been reached in the
latter case. It would appear, then, that reversing the order of experiment
is not an effective device for extreme ranges of practice. As a method of
eliminating practice effects, it would appear that the device is more
applicable to stages of medium range.

On this basis, Phillips has seemingly chosen a suitable time (nine
days) for commencing the experiment proper, though, according to
our data, it cannot be said that the curve is nearly horizontal at this
stage of practice.

1 British Journal of Psychology, July, 1920, p. 328.

(d) Significance of the practice period. There is another impor- 
tant fact to be noted whenever changes in efficiency are measured by 
output. A number of investigators have found that a preliminary 
practice period is a necessity not only to enable the experimenter to 
estimate and control the effects of practice but also to eliminate as 
far as possible the variable and arbitrary effects of unaccustomed 
tasks and surroundings. 

Reference may be made to Wyatt’s and Weston’s attempts to 
frame a test of industrial fatigue involving a task of the same nature 
as the fatiguing work itself. They conclude, “Even though the 
operations involved in the test are similar to those which the winders 
are in the habit of performing in the course of the ordinary winding 
operations, about three weeks must elapse before the winders become 
adapted to test conditions. The effects of practice are distinctly 
noticeable throughout the test period, and the least variation from the 
usual conditions of labor has a disturbing influence upon the results.”¹
There is no reason to assume that some such phenomenon should not
obtain in pedagogical research.

Possibly there is a physiological explanation in the glandular
hypersecretions during times of stress and excitement. Elliot² has shown
that in animals no greater excitement is needed to induce higher
discharge of adrenalin than the strangeness of new quarters, and
adrenalin has an effect upon the fatigue situation. Although
reference here is primarily to muscular fatigue, Gruber³ suggests that its
effect upon nervous elements cannot be denied. Swift quite rightly
points out that “If the blood circulation of the brain is controlled by
the autonomic system (and there is evidence for this), then the tonic
effect of adrenalin, already demonstrated in muscular fatigue, may be
operative in mental activity.”⁴ This means that conditions which
are new and strange affect materially the normal fatigue reaction.

In any case experiment seems to show that the result obtained in 
industry is equally evident in school. From this point alone it would 
seem necessary to make use of a practice period for tests involving 
estimation of output as indicative of changes in efficiency. 





1 British Journal of Psychology, July, 1920, p. 306. 

2 Journal of Physiology, Vol. 44, p. 409. 

3 American Journal of Physiology, Vol. 32, p. 221. 

4 Swift: “Psychology and the Day’s Work,” p. 104.
Method of Equal Groups.—The method of equal groups consists in
obtaining first, sets of subjects equal in number and in performance.
A preliminary experiment for determining this is usually made. Then,
if only two groups are used, the first in order of merit goes into Group
A while the second goes to Group B, this procedure being applied
till the groups are complete. If now one set is tested before work on
an occasion or a series of occasions, and the other after work, other
conditions being constant for both, we presume that the element of
practice has been allowed for, both groups having had the same number
of performances. Schuyten is said to have been the first to use such
a method. Of our modern investigators, Winch, Muscio, and others
have found the method to give reliable results. Muscio is particularly
confident that it serves to eliminate the effect of practice. He writes
thus: “The first problem therefore was to obtain a satisfactory
experimental method by which the effects of fatigue could be separated from
the effects of practice.”¹ After discussing the inadequacy of the
method of repetitions he goes on thus: “Is there then any other method
by which practice can be eliminated? Such a method is the ‘method
of equal groups.’”
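
The division by order of merit may be sketched as follows. The code is a minimal illustration, assuming strict alternation down the ranked list; the subject labels, the preliminary scores and the function name `equal_groups` are hypothetical.

```python
def equal_groups(scores):
    """Rank subjects by preliminary score and deal them alternately into A and B."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # First in order of merit to Group A, second to Group B, and so on.
    group_a = [name for i, (name, _) in enumerate(ranked) if i % 2 == 0]
    group_b = [name for i, (name, _) in enumerate(ranked) if i % 2 == 1]
    return group_a, group_b


# Hypothetical preliminary scores for six subjects.
preliminary = {"S1": 52, "S2": 47, "S3": 45, "S4": 40, "S5": 38, "S6": 33}

group_a, group_b = equal_groups(preliminary)
total_a = sum(preliminary[s] for s in group_a)
total_b = sum(preliminary[s] for s in group_b)

print(group_a, total_a)
print(group_b, total_b)
```

Strict alternation leaves Group A somewhat ahead in total, since it always receives the better subject of each successive pair. How the investigators named above adjusted for this is not stated in the passage, so no adjustment is attempted here.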

It has been urged that the great objection to this method is the 
difficulty of obtaining groups which exactly equal one another in per- 
formance. Muscio obtains groups which for total performance 
resemble one another as closely as 2289/2881 and another pair 
2966 / 2965.75. 

This arrangement cannot be objected to on the score of inequality. 
A more serious objection is brought to light if we consider the
terminology used. The groups are said to be “equal in number and in the
capacity for a given test.” What is meant by “capacity”? Is it
“ability”? If so, then we must distinguish it from “improvability.”
In other words, we must not assume that because two individuals or 
two groups are equal in ability of performance after one trial, that 
they will again be equal after two and especially after four or five. 
Muscio divides his one pair of groups on the basis of a fourth perform- 
ance in a given test. But we are not quite sure whether a fourth test- 
ing is any more likely to eliminate the danger than a first. 

The following table gives performance in cancellation for six
subjects under identical conditions, one being the initial performance, the
other being the twenty-third.





1 British Journal of Psychology, July, 1920, p. 328. 
No.    First    Twenty-third
1      815      2557
2      437      2195
3      688      2075
4      640      2197
5      601      1987
6      689      2049


It will be seen that No. 2, although much lower at first than No. 6,
is actually 146 higher in the other reading. The same is true in
reference to the fourth subject. In Fig. 1 we give the practice graphs
for No. 1, 2, and 6. No. 6 in initial ability resembles No. 1 more 
closely than he does No. 2. It will be seen however that from the 
ninth repetition, No. 6 and No. 1 are widely different while No. 6 
and No. 2 are very similar. Indeed, No. 6 who was so much superior 
in initial ability to No. 2, is found after the eighteenth performance 
to be considerably below him. 
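
The change of order can be verified directly from the six pairs of scores in the table. The short Python sketch below uses only those tabled figures; the variable names are ours.

```python
# First and twenty-third cancellation scores, copied from the table.
first = {1: 815, 2: 437, 3: 688, 4: 640, 5: 601, 6: 689}
last = {1: 2557, 2: 2195, 3: 2075, 4: 2197, 5: 1987, 6: 2049}

# Gain of each subject between the two performances.
gains = {n: last[n] - first[n] for n in first}

# Subjects listed in order of merit at each stage (best first).
rank_first = sorted(first, key=first.get, reverse=True)
rank_last = sorted(last, key=last.get, reverse=True)

print("gains:", gains)
print("order at first performance:       ", rank_first)
print("order at twenty-third performance:", rank_last)
print("No. 2 minus No. 6 at the twenty-third:", last[2] - last[6])
```

Only No. 1 keeps his place. No. 2, last of the six at the outset, shows the largest gain and stands third at the twenty-third performance.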

With respect to individuals, it may be said that two factors are 
important in this connection, as determining future performance. 

1. The stage of practice already reached. 

2. The relative improvabilities of the individuals concerned. 

For the specific tests usually used, all subjects are generally
assumed to be at the same stage of practice, practically zero. With
regard to the second factor, no account is taken of it since the group
division is usually based on “ability.” But graphs of the
performances of individuals show clearly that the relative abilities at one stage
of practice will not necessarily obtain at another stage in the
acquirement of skill.

What are the conditions under which two groups equal, let us 
say, for the third performance remain equal for the subsequent 
performances? 

1. There is the possibility of equal improvement throughout. 

2. Corresponding individuals in each group may improve at the 
same rate. 

3. And, there is the possibility of chance improvements making 
the totals equal. 

The last possibility, if true to fact, cannot be utilized in
experimental technique. The second assumes that ability is a measure of
improvability; that for every standard of initial attainment there is a
correlated improvement. In essence it means high correlation
between ability and improvability in the particular function itself.

The first possibility means that improvement is constant for all
individuals and independent of ability.

The method of equal groups clearly assumes that either the first 
or the second relationship is true. What does investigation report 
as to these relationships? There is unanimous opinion among investi- 
gators that individuals vary greatly in improvability; and that 
improvement certainly is not constant for all individuals. With 
regard to the relation between ability and improvability, as can only 
be expected, there is divergence of view. The general trend of the
findings is summed up by Race: “the ability possessed by any person
at any time is, in a large measure, a product of what native capacity
he has and a prophecy of what further improvement he will make from
a given amount of practice.”¹

It is clear, however, that in this connection what we want is not a 
general statement about the relationship of improvability and ability, 
but an indication of the closeness of this relation at various stages in 
the practice curve—information which would perfect our technique 
in experiment. 

With a view to obtaining some light on the use of an addition test 
used in fatigue work, we recorded the performances of 35 children, 
between the ages of 13 and 14, for 20 daily periods of 10 minutes’
duration. The whole work was done under carefully arranged
experimental conditions, using incentives to obtain maximum effort.

The following are some of the results: 


RELATION BETWEEN ABILITY DISTRIBUTION AT VARIOUS STAGES OF PRACTICE

Performance taken    Average of three          Correlation between
as basis             following performances    them
1                    2, 3, 4                   .84 (.03)
4                    5, 6, 7                   .98 (.01)
7                    8, 9, 10                  .98 (.01)
10                   11, 12, 13                .99 (.01)
13                   14, 15, 16                .99 (.01)
16                   17, 18, 19                .99 (.01)

It is clear that those who are at a certain position in the ability 
distribution tend to retain their position with further practice. 
Increase in skill tends to make this all the more probable. 
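
The computation behind such a table may be sketched as follows. This is an illustration only, assuming the ordinary product-moment coefficient. The scores of the five subjects are hypothetical, and the names `pearson` and `basis_vs_following` are ours.

```python
from math import sqrt


def pearson(xs, ys):
    """Product-moment correlation between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def basis_vs_following(records, basis):
    """Correlate performance `basis` (counted from 1) with the mean of the next three."""
    base = [r[basis - 1] for r in records]
    following = [sum(r[basis:basis + 3]) / 3 for r in records]
    return pearson(base, following)


# Hypothetical scores: one row per subject, one column per daily performance.
records = [
    [20, 26, 30, 33, 35, 36, 37],
    [15, 22, 27, 31, 34, 36, 38],
    [25, 28, 30, 32, 33, 34, 35],
    [10, 14, 18, 21, 24, 26, 28],
    [18, 25, 29, 32, 34, 35, 36],
]

r_first = basis_vs_following(records, 1)
r_fourth = basis_vs_following(records, 4)
print(round(r_first, 2), round(r_fourth, 2))
```

In these assumed figures, as in the table, the fourth performance forecasts the following three more closely than the first does.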





1 Race: Improvability. Columbia University Contribution to Education, p. 34. 
If we estimate the improvability at different stages of the practice
curve we get the following table: 


RELATION BETWEEN ABILITY AND IMPROVEMENT AT DIFFERENT STAGES IN THE
PRACTICE CURVE

Performance used
as basis             Correlation
1                    .50 (.09)
4                    .75 (.05)
7                    .32 (.11)
10                   .07 (.12)
13                   .07 (.12)
16                   .05 (.12)

Improvement is represented by the gain of the average of the three
following performances over the initial performance (Column 1).


From the figures, as well as from perusal of individual curves, one 
might say with some probability that during the earlier portions of 
the practice curve, actual performance is indicative of the measure 
of further improvement. This condition is however subject to change. 
At a later stage in the practice curve ability or the measure of skill 
bears no relation to subsequent improvement. It appears probable 
that at this stage, the high abilities temporarily are subject to a low 
rate of improvement while the lesser abilities retain their original rate. 
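
The measure of improvement used in the table can be sketched in the same way. The code below is an illustration under stated assumptions: the four sets of scores are hypothetical and deliberately exaggerated, so that the abler subjects gain most at first and then reach a plateau; the names `pearson` and `ability_vs_gain` are ours.

```python
from math import sqrt


def pearson(xs, ys):
    """Product-moment correlation between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ability_vs_gain(records, basis):
    """Correlate performance `basis` with the gain of the next three over it."""
    base = [r[basis - 1] for r in records]
    # Gain = average of the three following performances minus the basis score.
    gain = [sum(r[basis:basis + 3]) / 3 - r[basis - 1] for r in records]
    return pearson(base, gain)


# Hypothetical scores: one row per subject, ablest subject first.
records = [
    [30, 38, 44, 48, 50, 51, 51],
    [24, 30, 35, 39, 42, 44, 46],
    [18, 22, 26, 30, 33, 36, 39],
    [12, 15, 18, 21, 24, 27, 30],
]

r_early = ability_vs_gain(records, 1)
r_late = ability_vs_gain(records, 4)
print(round(r_early, 2), round(r_late, 2))
```

Here the relation is strongly positive at the outset and reverses once the abler subjects flatten out. The recorded correlations fall only to about zero, so the assumed figures overstate the effect in order to make it visible in four subjects.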

From a prolonged study of individual differences in the rate of 
learning of cancellation and computation tests, we would say that it 
is not at all improbable that though the able subjects show a very 
rapid rate of skill acquirement in the earlier stages, their curves are 
characterized by plateaus. The lesser abilities on the other hand, 
though showing no rapid rate of initial improvement, may possibly 
maintain a steady improvement throughout. But it would be unwise 
at this stage of experiment to make generalizations. The important 
point is that there is a tendency for performance to be associated with 
the amount of improvement in the earlier phases of a practice curve. 
It is to be noted that the equal-group method is generally used for 
this earlier phase. 

The question arises whether ability or improvability should be 
the basis of division of subjects into equal groups. If we take, for 
instance, the fourth performance of a number of subjects, arrange them 
on a basis of ability and make two equal groups of them, the
subsequent groups for the fifth, sixth, seventh, and so on to the tenth
performance are only approximately equal. The careful equalizing
therefore of groups involving redistribution of the mathematical order
is superficial. It is even inadvisable since, of the two desiderata, we
should choose the ability distribution itself to be the basis of equality.
We have found, however, that if the groups be arranged on an 
scores as an index of improvability since all subjects require this
amount of practice in getting accustomed to the test. The first two 
or three scores cannot be normally considered to fall within the law of 
the curve for every individual. One would, therefore, have to resort 
to the method of division on the basis of the fourth score itself. 

Theoretically, and for the best experimental control, we need to 
know the stage during the practice curve when the correlation between 
score and improvement is closest. For this interval alone, in principle, 
are we justified in applying the “equal groups” method.

Practically one may take it that it works conveniently in experi- 
ment subject to certain reservations. 

1. The ability distribution given by a particular test is true only 
of that particular state of practice. 

2. The fewer the attempted performances subsequent
to accepting this distribution, the more likely is the method to be
effective.

3. If a number of tests are to be made, change of distribution by 
practice will make itself evident and new trials are necessary to adjust 
the differences. 
INTELLIGENT TESTING


Psychological Tests for College Freshmen. L. L. Thurstone. The Educational
Record, Oct., 1925, 282-294. This is a report which summarizes the results of 
9 psychological tests from 60 colleges and gives tables of norms for these tests. 

A Quantitative Study of the Results of Grouping First-grade Classes According to 
Mental Age. Grace Arthur. Journal of Educational Research, Oct., 1925, 173-
185. The reading achievements of mental age groups in seven first grade classes 
show that the duller pupils make the greatest relative gains during the year. 

The Multi-mental Scale. William A. McCall and His Students. Teachers 
College Record, Oct., 1925, 109-120. Special features of this scale are pointed 
out, together with its methods of validation and use. 

The Intelligence of Southern Negro Children. Thomas R. Garth and Calvin A. 
Whatley. School and Society, Oct. 17, 1925, 501-504. Scores for 1272 negro 
children on the N.I.T., in grades 3 to 8 inclusive, show that the average MA is 
2.6 years below the average CA. 

The Selective Value of Mental Tests. W.S. Guiler. The Elementary School 
Journal, Oct., 1925, 112-115. This is an attempt to learn what amount of intelli- 
gence is necessary for successful work in a teachers college. 

A Method of Reporting the Significance of Intelligence Tests to Parents and 
Teachers. Marvin L. Darsie. School and Society, Nov. 7, 1925, 597-600. A 
discussion of the values of such a report and a form for making the report under- 
standable is included. 

The Mental Capacity of Sixth Grade Jewish and Italian Children. Dorothy 
Wilson Seago and Theresa Shulkin Koldin. School and Society, Oct. 31, 1925, 
564-568. An analysis and comparisons of the responses of 800 Jewish and 452 
Italian twelve-year-old boys in 6B grade in New York City on the N.I.T. 

Meeting the Need for Improved Measures to Be Used in the College Guidance 
Program. Glen U. Cleeton. Educational Administration and Supervision, Oct., 
1925, 489-494. Research is suggested along the line of evaluation of existing 
measures in this field. 

What Is Reasonable in Testing? C. A. S. Dwight. The Journal of Educa- 
tional Method, Oct., 1925, 60-62. The author points out 10 cautionary reserva- 
tions in the use of tests. 

Intelligence of Best and Poorest Pupils. Charles L. Harlan. Educational 
Administration and Supervision, Oct., 1925, 495-499. A comparative study of the 
mental ability of 329 pupils whom teachers picked out as best and 290 pupils 
picked out as poorest.
Relation of Intelligence to Vocabulary and to Language Training. W. Hardin
Hughes. The English Journal, Oct., 1925, 621-626. The author reports, for 
junior college students, a considerably closer relationship between intelligence and 
vocabulary ability than between intelligence and the total number of years com- 
pleted by students in high-school English and foreign languages. 


ACHIEVEMENT TESTING 


Some Implications of the Revised Van Wagenen History Scales. Marvin J. 
Van Wagenen. Teachers College Record, Oct., 1925, 142-148. This paper points 
out the underlying assumptions and practical advantages of these revised scales. 

Is There Value in the Final Examination? T. H. Schutte. Journal of Edu- 
cational Research, Oct., 1925, 204-213. The data show that the achievement of 
the examination group is superior to that of the non-examination group. 

A General Science Test. Herbert A. Toops. School Science and Mathematics, 
Nov., 1925, 817-822. The test is composed of one-word-answer questions. 
Methods of construction and norms for university freshmen are given.

Some Experimental Comparisons of True-false Tests and Traditional Examina- 
tions. C. C. Crawford and D. A. Raynaldo. The School Review, Nov., 1925, 
698-706. Twenty comparisons between the two types of examinations, in the 
case of 14 different classes, are given. In 15 of these, the experimental coefficients 
favor the traditional type. 


PSYCHOLOGY OF LEARNING AND OF SCHOOL SUBJECTS


Improving Speed and Comprehension in Reading. Clara B. Springsteed. The 
Journal of Educational Method, Oct., 1925, 48-52. This is a discussion of the 
methods and a statement of the results of an intensive campaign to improve speed 
and comprehension in reading with a large number of pupils from foreign language 
homes. 

Class Results with Spaced and Unspaced Memorizing. Kate Gordon. Journal 
of Experimental Psychology, Oct., 1925, 337-343. Four groups of college students, 
to whom materials were presented in different ways, are compared as to immediate 
and delayed recall. 

The Teaching of Language Forms. Clara McPhee. The Elementary School 
Journal, Oct., 1925, 137-146. A list of language forms for the lower grades and
the experimental grade placement of these forms. 


CHARACTER AND PERSONALITY 


Character Trait Tests and the Prognosis of College Achievement. Othniel R. 
Chambers. The Journal of Abnormal and Social Psychology, Oct., 1925, 303- 
311. The Pressey X-O Tests seem to predict achievement in college about as well 
as a group intelligence test. A combination of the two increases the predictive 
value only slightly. 

Through Our Own Looking Glass. J. S. Kinder. School and Society, Oct. 
24, 1925, 533-536. Forty-two college women rated themselves on 30 traits. At 
later intervals they gave ratings for the “average” and “ideal” college women and
the “average” college man. The traits and comparisons of the results are given.
TEACHERS’ MARKS


Economy and Fairness in Marking. Stephen G. Rich. The Journal of Edu- 
cational Method, Oct., 1925, 67-70. A procedure is presented which increases the 
fairness to pupils in assigning marks. 

The Predictive Value of College Marks in Medical Subjects. E. A. Bott. Jour- 
nal of Educational Research, Oct., 1925, 214-227. The use of college marks as an 
objective criterion for substantiating the predictive value of test scores is dis- 
cussed in the light of data for two college generations. 
THE INTERRELATIONS OF HUMAN TRAITS AND GROUP ACTIVITIES


Social Psychology by Knight Dunlap. Baltimore: Williams & Wilkins 
Co., 1925. Pp. 261. 


In reading Dr. Dunlap’s recent book, the reviewer found several 
characteristics both in method of treatment and in conclusions reached 
that impressed him distinctly favorably. In the first place the book 
—unlike many others in social psychology—is not primarily an 
extended defense of some single principle of psychology nor is it essen- 
tially an elaborate superstructure which would necessarily collapse 
with the demonstration of a defect in the underlying system. Unlike 
certain recent books, moreover, it is, with the exception of less than 10
pages, exclusively devoted to the problems of social psychology— 
social activities dependent upon sex differences, marriage and family 
life; the nature of religious, civic, martial and other organization; 
the conditions of social stimulation and progress; the principles of 
social organizations. The author, to be sure, could not escape taking 
some definite psychological point of view but this standpoint is treated 
with dispatch in a very brief introductory chapter. 

Dunlap’s viewpoint is substantially as follows: Social psychology 
is the study of human groupings and the analysis of the mental factors
involved in them. Human groupings are the manifestations of the 
social characteristics of man. Social psychology, then, undertakes 
to correlate group activities and human characteristics, to interpret 
the one in terms of the other. The author begins by stating some of 
the human characteristics which function in social life. 

While accepting the reaction hypothesis as the fundamental 
explanatory principle, the author avoids the dilemmas of the radical 
behaviorists. “Some of the responses of the organism are
‘conscious:’ that is to say, that in these responses, or through them, or by
them (it makes little difference what expression we use), the ego
observes, or takes note of, or is aware of external objects, or of its own
organism, or of the relations involved in either of them or between
them.” The concept of instincts as ultimate psychological or
biological entities is rejected, although “instincts as mere logical
classifications of human behavior” are retained though little used. To feelings
and emotions as significant human characters of value in interpreting
social behavior, Dunlap adds “desires” (such as alimentary desires,
excretory desires, desires for rest, desires for activity, for shelter, for
conformity, for preeminence, for progeny, for sex gratification), which
“unlike the instincts are not mere interpretations but actual facts of
experience.” Whether the desires listed represent merely “a
convenient logical division, or whether these are really distinct types,
biologically dependent on different tissues or different processes,
remains to be determined; but there is a strong probability of truth in
the latter supposition although it is not to be assumed that the list as
given is a final one.” Concepts of the “unconscious mind” together
with the correlated principles of explanation are rejected, while the
most widely accepted theories of organic heredity, of learning capacity,
intelligence, and the like are accepted.

Utilizing such explanatory principles as these, Dunlap immedi- 
ately proceeds to his main task, the description and explanation of 
group activities. As preparation for a discussion of the family group, 
he gives a detailed treatment of sex differences—on the whole a most 
excellent piece of work. The author stresses his conviction that the 
main differences between men and women result from the differences 
in the sex functions. Inferiority in strength and physical capacity, 
the latter due mainly to the debilitating effects of the periodic function, 
profoundly influences not only achievement but purposes, moods and
all social adjustments. Differences in the nature and expression of 
sex desires and the ways in which they are gratified are suggested as a 
significant source of maladjustment in marriage. Dunlap’s views here 
lead to a theory of sex education which should be of vital interest to 
students of education. It is proposed that sex instruction be not 
limited to mere physiology of sex functions and to the traditional 
type of sex hygiene, but that it include specially the psychology of 
sex desires, aversions and mental adjustments with emphasis on the 
differences between men and women. 

The family, religious, civic and martial organizations are treated 
at length in three chapters. The author considers the theories con- 
cerning the origin of the organization, such as the sex theories, ani- 
mism, etc. in the case of religion, and offers his own solution—such as 
is suggested in the statement that “the only theory of religion which
today seems to have value as a scientific hypothesis is the theory that 
religion has its origin and its support in dissatisfaction with life result- 
ing from reflection on the failure of life to satisfy the primary desires 
of man.” This thesis is then elaborated; the forms of religious action 
and belief explained; this association of religion with magic and ritual 
suggested. Finally, the social values and other effects of religious 
activities are discussed. In the treatment of other social organiza- 
tions, a similar topical outline is followed. 

In the chapter on “Conditions of Social Progress” Dunlap writes
a critical and able account of possibilities of social betterment. In 
the main this chapter is concerned with the possibilities of improv- 
ing the human stock by applying the facts of organic heredity. 
The theories and limitations of eugenics, sexual selection and birth 
control are critically reviewed. Some advocates of the eugenics
program may be surprised, in the trend of Dunlap’s discussion which is
not specially optimistic, by such statements as these: “Physical
superiority is more plausible, at present, as a sign of eugenic fitness
than is mental superiority,” “the program of eugenics, for the present,
must be concerned largely with the elimination of (a few types of)
undesirables.”

In a chapter on “Principles of Social Organization” the author 
gives an extended treatment of certain topics largely neglected in 
some other recent books on social psychology, namely, the unique 
characteristics of organized group action, the results of working as a 
group as distinguished from working in a group. Here are discussed 
such topics as the significance of social consciousness and feeling, of 
various forms of communication, of imitation and suggestion, of 
language and culture. A section is devoted to the differences between 
the crowd, the mob, and “higher” forms of social organization such as 
the well developed committee or brotherhood. A final section of this 
chapter deals with forms of social education and control of conduct, 
with the analysis of propaganda extended to form the last chapter in 
the book. 

In all of these chapters are to be found many fresh suggestions. 
Although the explanatory principles are drawn largely from psychology, 
factual materials in abundance have been assembled from other 
sources, especially from biology, anthropology, sociology, ethics 
and history. Although much of the treatment is speculative and
dogmatic, as the author warns the reader in the Preface that it would
be, it maintains a fairness and breadth in viewpoint that could
scarcely be achieved by a writer attempting to defend a single explana- 
tory thesis. Being a survey rather than a system of social psychology, 
the book should be very serviceable as a text. Written with clarity 
and directness with few unusual terms, the volume, while containing 
much substance for the advanced student, should be quite intelligible 
to the beginner. No references are listed in the text, but an appended 
bibliography contains about 25 titles of general treatises. The account 
differs so greatly in content and method of treatment from the excellent 
recent book by Allport that the two can scarcely be considered as 
competitors. Rather each is a supplement to the other; both are 
needed to cover the field fully. ARTHUR I. GATES.
Teachers College, Columbia University. 


EXPLORATION IN THE FIELD OF CHARACTER TESTING 


Discovery and Development of Tests of Character, by Theodore F. 
Lentz, Jr. Teachers College, Columbia University, 1925. 
Pp. 47. 


The pioneer efforts of Dr. Lentz and others in the discovery and 
development of methods in testing character may be compared to 
those of Binet in the field of intelligence measurement. After several 
years of experimentation, Binet emerged with a series of tests which 
differentiated the various age levels from each other, whereas Dr. 
Lentz starts off with two sociologically contrasted groups, and 
searches for “objective verifiable differences between delinquent
and non-delinquent boys.” These two groups were subjected to 
a wide range of tests, and those which showed promise of differen- 
tiating the two types of behavior were checked by further experi- 
mentation on three widely separate pairs of delinquent and control 
groups. 

Thus Dr. Lentz has made a contribution not only in offering the 
tests which were finally retained after his sifting process, but in suggest- 
ing a criterion of selection, the value of which he has attempted 
to check. He concludes that the general method which he used, 
“namely, the intensive, objective comparative study of contrasted 
groups, while not fully evaluated by this experiment, gives promise 
of yielding positive results.” GLADYS C. SCHWESINGER.
PRINCIPLES OF SALESMANSHIP AND BUSINESS LEADERSHIP


Business Power through Psychology, by Edgar James Swift. New 
York and London: Charles Scribner’s Sons, 1925. Pp. 397. 


The title of this book is suggestive of its content only in a general 
way. It deals with the psychology of salesmanship and the principles 
of industrial management, and may be read profitably by any business 
man or salesman who is interested in gaining sound psychological 
information regarding human nature especially as this touches upon 
his own immediate problem in dealing with men. The book itself, 
which is very readable and excellent in its literary style, might furnish 
the best example of salesmanship in view of the technique of persuasion 
and the manner of procedure employed by the author. It is further 
to be recommended as a good antidote for shoddy and counterfeit 
psychology on “The Subconscious Mind,” “Character Reading,”
“High-power Psychology,” and the like.

The first chapters treat the general principles of the simpler sales 
situation, showing how it may, like behavior in general, be conceived in
terms of stimulus and response. The following chapters give special 
attention to the subtler and more difficult situations requiring tact 
and skill. Suggestion and other principles of persuasion are discussed; 
emphasis is placed on the need of getting into close rapport with the 
prospect. The efficient salesman adapts himself to the total situation, 
the elements of which are all important in forcing a decision; he does 
not allow the rules and maxims of the sales-manual to handicap his 
ingenuity. The total situation includes not only specific information 
regarding the commodity, but the needs of the prospective buyers 
in his territory, the particular difficulties and deterrents standing in 
the way of any individual buyer, and the variety of immediate 
environmental factors which influence the customer. 

The second half of the book deals with the problems of the executive 
and general management, with constructive policies for extending 
sales, with the enormous expenses incurred through rapid turn-over 
of workmen, salesmen and personnel, and with scientific methods 
for reducing labor turn-over. In the last chapters under the heads 
of “Thinking as an Asset to Business,” “Mental Efficiency,” and “The
Psychology of Leadership,” attention is called to the necessity of close 
deliberation before entering upon a business project; the importance 
of guarding against the tendency for ideas to crystallize and make 
new ideas unwelcome; the danger to functional plasticity incurred 
through such obstacles to thinking and through “fundamentalism”
in any of its forms; in short, the need of greater resourcefulness, 
adaptability, self-reliance and initiative, which are the supreme
prerequisites of leaders of men. 

It is probably the author’s recognition of the misfortunes of 
worshipping the platitudinous stuff of the sales-manuals and the cut 
and dried formula of the sales talk, that has tended to make his treat- 
ment less systematic and constructive. However, his attractive 
style and frequent literary allusions make his book very interesting 
reading. FREDERICK H. LUND.

Barnard College, Columbia University.





ADAPTING EDUCATION TO FEEBLE-MINDED ADULTS 


Education of Feeble-minded Women, by Mary Vanuxem. New York
City: Teachers College, Columbia University, 1925. Pp. 74.
Cloth. 


This treatise is an account, simply and directly given, of the 
endeavors at Laurelton State Village to find an educational program 
adapted to the needs of adult feeble-minded women, many of whom 
had had outside worldly experience. The curriculum was considered 
from three angles; academic, industrial and moral; the pupils were 
viewed as to their permanency of residence in the institution, or 
their ultimate placement in extra-institutional life. A desire for 
education, created by well-timed propaganda, in conjunction with 
individual instruction, enabled the girls to make rapid progress in 
their formal school work. The most interesting part of the study con- 
cerns itself with the tables of industrial capacity which were developed 
at the Village. These show distinct changes from previously accepted 
standards. The author attributes this to three causes: (1) Increased 
motor coordination, (2) more extended worldly experience, and (3) 
intensive individual training. A modified system of self-government 
partially solved the problem of moral and disciplinary guidance. The 
inclusion of a Personality Work Sheet in each girl’s record was found 
to be not only helpful, but necessary, in planning her continued educa-
tional program and ultimate placement. The author concludes
that the full measure of training can, in all probability, be given with 
profit only to the girl who is socially-minded. 


GLADYS C. SCHWESINGER.
THE HIGH SCHOOL AND THE CURRICULUM


Curriculum-adjustment in the Secondary School, by Philip W. L. Cox. 
Philadelphia: J. B. Lippincott Co., 1925. Pp. 285. 


Changes in one aspect of our social heritage call for adjustments 
in other parts. Changes in industry, for example, call for changes 
in education. Generally there is a period of maladjustment following 
rapid changes in material culture. The time of this period can be 
lessened by intelligent curriculum adjustments. The curriculum maker,
therefore, must be a social analyst, “a student of human nature and
human affairs.” He must do more than study the present. He
must do better than “curriculum-tinkering.” He must, as Dr. Rugg
often reminds us, make a prevision of our needs in “The Great Society”
and reconstruct our curriculum accordingly. This, in brief, describes 
the author’s point of view towards curriculum-adjustment. 

In the words of the author, “the book is written to promote the
realization of the conception of the school as a purposefully controlled 
and idealized community.” 

The book is divided into three parts. In Part I, the present situa-
tion in the secondary school is described. In Part II, called “The
Scientific Basis of the Secondary School,” procedures of curriculum
modifications are considered. Part III is captioned “Principles of
Secondary Curriculum Adjustment.” Some of the questions con- 
sidered here are: How should the school connect with life? What 
organization best achieves our ends? What should be the basis of 
common experiences? What should be the basis of elective choices? 
What does graduation mean? 

In the opinion of the author, secondary schools in a democracy
should not serve as selective agencies with the basis of selection largely
abstract intelligence. Like Professors Inglis, Bode and others, he
favors “the establishment of core-curriculum courses from which all
pupils of whatever mental or social status could profit.” 

This book, written in a straightforward and clear style, should be 
helpful to students of secondary education; especially to those who 
have not had the good fortune to do graduate work with Dr. H. O. 
Rugg, Dr. W. H. Kilpatrick and Dr. T. H. Briggs. A well-selected 
bibliography at the end adds to its usefulness. H. MELTZER.
CORRECTION


Through error the caption in the second main column of Table 
II in Professor Odell’s Article on the Effect of the Previous Testing 
of Scores in the October issue should have read: “IQ’s computed on
assumption that mental growth ceases at 16 years,” while the caption
for column three should have read “IQ’s computed on assumption 
that mental growth ceases at 14 years.” 